ABSTRACT


The practical applicability of randomization tests is discussed. The randomization 
test for two independent samples is the specific test examined in both hypothesis and 
significance testing contexts. This test has optimum theoretical properties as a 
nonparametric procedure for comparing the means of two populations. However, the 
calculations that are required to actually use the test in practice can be extremely time 
consuming. Using the randomization test for two independent samples to conduct a 
significance test is shown to be a #P-complete enumeration problem. This implies that 
a computationally efficient way to perform an exact version of the procedure is not 
likely to exist. Two approximate ways to perform the randomization test are studied
with the aid of a simulation. One method uses a normal distribution to approximate
the actual randomization distribution and the other method is the usual two-sample
t-test. The t-test is found to yield results very close to those that are obtained from the
exact randomization test under the conditions studied.
I. INTRODUCTION


Randomization tests have long been recognized as powerful nonparametric 
statistical methods since the introduction of the principal ideas by R.A. Fisher in 1935.
Even when compared to the most powerful parametric tests such as the t-test, 
randomization tests perform extremely well. Theoretical work since Fisher’s paper has 
indicated that randomization tests may be the best methods to use in many situations 
involving significance testing or tests of hypotheses. This is particularly true if 
assumptions about the underlying probability distributions are difficult to establish. 

Despite their strong theoretical basis, however, randomization tests have not
been in widespread use, mainly because they are very tedious to perform.
Even when sample sizes are relatively small,
the computation time required to perform these tests can be significant. While this is 
less of a problem with modern computing equipment, there still exists a point where 
the size of the data sets is large enough to make the procedures impractical. This point 
is reached rapidly due to the inherent combinatorial nature of the algorithms used to 
perform the tests. Vast improvements in computational speed have only a marginal
effect on the size of the data sets that can be handled. Approximate randomization 
tests have been developed because of these difficulties, but analytic results describing 
the errors involved with their use are limited. Exact analytic results are difficult to 
obtain because the form of the underlying distributions is not known. 

This thesis addresses the issue of practical implementation of randomization tests. 
The randomization test for two independent samples is the specific procedure chosen for 
the entire study. This procedure is representative of randomization tests in general. A 
complete description of this test, along with each assumption needed to ensure its
validity, is given first. Also included is a summary of some of the important theoretical
work that has been done since the test appeared in the literature. Next, the methods 
available for performing an exact version of this test are shown to require so much 
computation time when the length of the input data sets increases that the methods 
become impractical on even the fastest computers. The mathematical framework 
necessary to prove this result is fully developed using concepts from the theory of
#P-complete enumeration problems.


Finally, an approximate method for performing the randomization test for two 
independent samples is described. This method is compared to the exact test and the 
standard t-test, using the same sample values for each. The samples are generated from 
several distributions through standard simulation routines and the performance of each 
test in terms of significance level and average power is recorded. The results from this 
simulation are discussed, and recommendations are made as to which test should be
used and when.
II. RANDOMIZATION TEST THEORY


A. THE RANDOMIZATION TEST FOR TWO INDEPENDENT SAMPLES 


1. Randomization Concept 

The basic idea of randomization was introduced by Fisher in 1935 [Ref. 1]. 
Randomization involves taking precautions in the design and actual performance of an 
experiment to ensure the validity of statistical procedures used on the resulting data. A 
randomized experiment is one in which treatments are randomly assigned within each 
block. Fisher argued that, on the basis of a randomized experiment, it is possible to 
conduct a test of significance without making any assumptions about the distribution 
(a distribution-free procedure) [Ref. 2: p. 95]. The idea of using a randomization test is
to perform a hypothesis or significance test involving two or more samples from 
populations whose distribution functions are unknown. The hypotheses of interest 
usually take the form of testing whether or not these distribution functions are all 
identical, except for possibly different location parameters (means, for example). 

2. Test Method 

A randomization test for two independent samples was first proposed by Pitman 
[Ref. 3]. The purpose of this test is to compare the means of two populations. The 
procedure is to draw two random samples $X_1, \ldots, X_n$ and $Y_1, \ldots, Y_m$ of sizes n
and m respectively from two independent populations X and Y. Following the 
description in Conover [Ref. 4: p. 328], independence within each sample is assumed, as 
well as independence between the two samples. Also assumed is that either the two 
population distribution functions are identical, or one population has a larger mean 
than the other. Without this second assumption the test is still valid but might lack 
consistency. The hypothesis to be tested is that the mean of the population from 
which the X’s were drawn ($\mu_X$) is the same as the mean of the population from which
the Y’s were drawn ($\mu_Y$). The (two-tailed) alternative is that the means are not the
same. In other words, this is equivalent to testing

$H_0: \mu_X = \mu_Y$
vs. $H_1: \mu_X \neq \mu_Y$

where $H_0$ denotes the null hypothesis and $H_1$ is the alternate. This two-tailed test is the
specific form of the randomization test that will be referred to henceforth.
An appropriate test statistic that can be used is just the sum of the X
observations:

$T = \sum_{i=1}^{n} X_i$ (eqn 2.1)


The critical (or significance) level of the test is denoted $\alpha$. This number is equal to the
probability that the test statistic could have produced values identical to or more
extreme than the originally observed value $T_0$. To find $\alpha$, the null hypothesis $H_0$ is
assumed to be true; that is, the X and Y populations are identically distributed. If $H_0$
is true, then the X’s should have no more of a tendency to be low or high than do the
Y’s. Essentially, the X’s and Y’s could be thought of as just one collection of n+m
observations from the same distribution, and each selection of n X observations
should be considered equally likely from the n+m observations available.

The significance level $\alpha$ is obtained by counting the number of ways n of the
n+m observations may be selected so that their sum is equal to or more extreme than
the originally observed value of the test statistic $T_0$. More extreme means smaller if $T_0$
is in the lower tail or larger if $T_0$ is in the upper tail of the distribution of all possible
values of the test statistic using the observed data. The number of ways is doubled,
because the test is two-tailed, and divided by $\binom{n+m}{n}$ to yield $\alpha$. [Ref. 4: p. 329]

In the case of hypothesis testing, a critical value, say $\alpha_0$, is specified beforehand
and the null hypothesis is rejected if $\alpha < \alpha_0$. If significance testing is being performed,
the interpretation is somewhat different. In this case, there is no pre-specified value $\alpha_0$.
The significance level $\alpha$ is computed and if it is small, say less than .01, then either the
observed value of the test statistic happened to be a rare event or the basic premise
that the X’s and Y’s are identically distributed is unlikely. The smaller $\alpha$ is, the more
compelling the latter explanation becomes.
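
The counting procedure described in this section can be sketched in a few lines of Python. This is only an illustrative sketch, not the implementation used in the study: the function name, the rule for deciding which tail the observed statistic lies in (comparison with n times the pooled mean), and the capping of the doubled proportion at 1 are all assumptions made here for the example.

```python
from itertools import combinations
from math import comb


def exact_randomization_test(x, y):
    """Two-tailed exact randomization test with T = sum of the X sample.

    Assumptions of this sketch (not taken from the thesis): the tail is
    chosen by comparing the observed T with n times the pooled mean, and
    the doubled proportion is capped at 1.
    """
    pooled = list(x) + list(y)
    n = len(x)
    t_obs = sum(x)
    # Centre of the randomization distribution of T under the null hypothesis.
    centre = n * sum(pooled) / len(pooled)
    in_upper_tail = t_obs >= centre

    count = 0
    # Enumerate every way of choosing n of the n+m pooled observations.
    for subset in combinations(pooled, n):
        t = sum(subset)
        if in_upper_tail and t >= t_obs:
            count += 1
        elif not in_upper_tail and t <= t_obs:
            count += 1

    # Double the one-tail count and divide by the number of combinations.
    return min(1.0, 2 * count / comb(len(pooled), n))


# Small integer example: only {1, 2, 3} has a sum as small as the observed 6.
alpha = exact_randomization_test([1, 2, 3], [4, 5, 6])
print(alpha)
```

Integer data are used in the example so that the comparisons of sums are exact; with floating-point observations a small tolerance would be advisable. The loop visits every combination, which is exactly the cost examined in the chapter on computational issues.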


B. THEORETICAL PROPERTIES 
1. Efficiency and Asymptotic Relative Efficiency 
The term efficiency is applied to statistical tests when comparing the sample 
sizes required by two different tests that give comparable results. The power of a test is
defined to be the probability of rejecting the null hypothesis $H_0$ when it is false. The
power of a test depends upon factors such as sample size and the particular alternate
hypothesis $H_1$ chosen. Suppose two tests have the same level of significance and
power and they can both be used to test a particular $H_0$ against a particular alternate
$H_1$. Then the test requiring the smaller sample size is preferred, because a smaller
sample size means less cost and effort is required in the experiment. As indicated in
Conover [Ref. 4: p. 88], the test with the smaller sample size is said to be more efficient
than the other test.

Suppose $T_1$ and $T_2$ represent two tests that could be used to test a given $H_0$
against a given $H_1$. Suppose further that either test, if used, would yield the same value
of $\alpha$ and the same power characteristics. Then, adopting Conover’s notation
[Ref. 4: pp. 88-89], the relative efficiency of $T_1$ to $T_2$ is the ratio $n_2/n_1$, where $n_1$ and $n_2$
are the sample sizes required by the tests $T_1$ and $T_2$ respectively in order for each to
yield identical results.

The relative efficiency of two tests depends on the particular values chosen for
$\alpha$ and power and it also depends on the particular alternate hypothesis $H_1$ chosen if $H_1$
is composite. A composite hypothesis is one that does not specify a probability law
completely. It would be more useful if an efficiency measure could be developed that
does not depend on these quantities. Such a measure can be developed in the following
way. Consider two parallel sequences of tests constructed so that as $n_1$ and $n_2$ are
increased, the significance level and power of each pair of tests remains the same. To
accomplish this, two things would be required. First, as $n_1$ is increased, the power of
each test in the first sequence would change if the alternate hypothesis $H_1$ were kept
fixed. To keep the power constant, a different $H_1$ could be selected each time. The
values of $\alpha$ and power would then remain the same from test to test in the first
sequence. Second, for each value of $n_1$, a value of $n_2$ must be calculated so that each
test in the second sequence has the same values of $\alpha$ and power as its corresponding
test in the first sequence under the alternative hypothesis chosen. Then there is a
sequence of values of relative efficiency $n_2/n_1$, one for each pair of tests in the original
sequences. If $n_2/n_1$ approaches a constant as $n_1$ becomes large, then that constant is
called the asymptotic relative efficiency (A.R.E.) of the first sequence of tests to the
second, if the constant is the same for all values of $\alpha$ and power.
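
The construction above can be summarized in one line. The following is only a compact restatement under the assumption that the relative efficiency of the first test to the second is taken as the second sample size divided by the first, with $n_2$ regarded as a function of $n_1$ chosen to match significance level and power:

```latex
\mathrm{A.R.E.}(T_1, T_2) \;=\; \lim_{n_1 \to \infty} \frac{n_2(n_1)}{n_1},
\qquad \text{provided the limit is the same for every } \alpha \text{ and every power.}
```

A value below 1 then means the first test eventually needs more observations than the second, and a value above 1 means it needs fewer.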

The A.R.E. is one measure of a test’s performance. For many nonparametric 
tests, the A.R.E. is less than 1.0 when compared to the corresponding parametric tests
in situations where they are appropriate. This implies that, in general, a nonparametric
procedure will require a larger sample size to achieve the same results as a parametric
procedure if the basic assumptions of the parametric method are valid (e.g., normality).
However, according to Conover [Ref. 4: p. 327], the A.R.E. of the randomization test is
1.0 when compared to the most powerful parametric tests in some situations. The
A.R.E. may be much higher than 1.0 if the basic assumptions of the parametric test are
not met. Thus a randomization test should be at least as efficient as a parametric test
and could be more efficient on the basis of asymptotic relative efficiency. Note that, on
the basis of relative efficiency (not asymptotic), the randomization test might be better
or worse than a parametric test depending on the circumstances. Generally, though,
asymptotic relative efficiency is a reasonable and widely accepted measure of a test's
performance.
2. Unbiasedness 

The definition of an unbiased test is a test in which the probability of rejecting 
a false $H_0$ is always greater than or equal to the probability of rejecting a true $H_0$
[Ref. 4: p. 86]. Another way to state this is to say the power is at least as large as the
level of significance. This is obviously a desirable property to have; a test should be
more likely to reject $H_0$ when it is false than when it is true. The randomization test
has been shown to be an unbiased test in Lehmann and other sources [Refs. 5,6]. 

3. Uniformly Most Powerful Test 

The power of a test, denoted by $1 - \beta$, is the probability of rejecting a false
null hypothesis. In the case of a simple alternate hypothesis that specifies a probability
law completely, this is a unique number. However, in the case of a composite alternate
hypothesis, the power is not unique. The alternate hypothesis being considered here,
$H_1: \mu_X \neq \mu_Y$, is composite since there are an infinite number of possible probability
functions implied by the inequality. When the alternate hypothesis is of composite
type, power is represented by a power function, where the value of power depends on
the parameters of the alternate probability laws implied by $H_1$. Specifically,

Power = $P(\text{Reject } H_0 \mid \theta)$ (eqn 2.2)

where $\theta = \mu_X - \mu_Y$. The power function for a two-tailed test of $H_0$ vs. $H_1$ has a
characteristic ‘U’-shape centered at the value $\theta = 0$.


The size of a test is defined to be the maximum probability of a Type I error
(rejecting the null hypothesis $H_0$ when it is true) over all values of parameters for
which $H_0$ is true. Among all tests which have size $\alpha$, the best test (if it exists) is that
test which has the largest power over all values for which $H_1$ is true. Such a test is
called the uniformly most powerful test of size $\alpha$ [Ref. 7]. Graphically, this implies the
power function of a uniformly most powerful test will pass through the point $\alpha$ when
$H_0$ is true and will lie above the power curves of all other possible tests of size $\alpha$ that
could be used.


In the case of the randomization test for two independent samples, Oden and
Wedel [Ref. 5: p. 520] have stated the following for the case of a one-sided alternative
$H_1$: “Among all unbiased tests for testing $H_0$ against $H_1$ the test is uniformly most
powerful for the subclass of $H_1$ with elements ( f, g ) such that ln ( f/g ) is linear,
including e.g. the case of ‘normality and equal variances’.” The extension to a
two-sided alternative is readily apparent. Here, f and g are one-dimensional probability
density functions that belong to the class of functions associated with $H_1$. An example
of densities f and g that satisfy such conditions would be two standard exponential
density functions with parameters $\lambda_1$ and $\lambda_2$, respectively.
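
The linearity condition can be checked directly for the exponential example. The short derivation below assumes the rate parameterization of the exponential density, which is a choice made here for illustration:

```latex
f(x) = \lambda_1 e^{-\lambda_1 x}, \qquad g(x) = \lambda_2 e^{-\lambda_2 x}, \qquad x \ge 0,
\\[4pt]
\ln\frac{f(x)}{g(x)} = \ln\frac{\lambda_1}{\lambda_2} - (\lambda_1 - \lambda_2)\,x .
```

The right-hand side is a constant plus a multiple of x, so the log ratio is linear in x as the condition requires.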

This is a very significant result. The fact that the randomization test for two
independent samples is the uniformly most powerful test against a certain subclass of
alternatives is strong theoretical justification for use of the test in many circumstances.
When the other desirable properties of the test mentioned previously are also
considered, the implication is that the randomization test should be preferred over any
other method of comparing means unless underlying distributions can be clearly
justified.
III. COMPUTATIONAL ISSUES


A. COMBINATORIAL NATURE OF THE RANDOMIZATION TEST 
1. Rapid Growth of Combinations 

Even though the randomization test for two independent samples has been 
shown to possess many desirable properties, the test is not encountered very often in 
practice. The basic reason is the amount of computation time required to perform the 
test. In the previous chapter, the test method was shown to be essentially a counting 
procedure involving combinations of the data. The number of combinations possible of
n+m objects taken n at a time is $\binom{n+m}{n}$, and this number grows at a substantial rate as
n and m are increased. The following table illustrates the growth of combinations for
some selected values of n and m:


TABLE 1
COMBINATIONS

[Table body not recoverable; the surviving labels for n and m are 2, 7, 9, 11 and 15.]
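
Because the number of combinations is a binomial coefficient, its growth is easy to tabulate. The sketch below is illustrative only: the function name and the particular sample sizes are assumptions chosen for the example and are not claimed to reproduce the original table.

```python
from math import comb


def combination_count(n, m):
    """Number of ways to choose n of the n+m pooled observations."""
    return comb(n + m, n)


# Hypothetical sample sizes, chosen only to show how fast the count grows.
sizes = [2, 5, 7, 9, 11, 15]
print("n\\m " + " ".join(f"{m:>12d}" for m in sizes))
for n in sizes:
    row = " ".join(f"{combination_count(n, m):>12d}" for m in sizes)
    print(f"{n:>3d} {row}")
```

Even at n = m = 15 the count already exceeds one hundred and fifty million.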





There is no known way to perform the exact randomization test for the
general case other than enumerating all possible combinations of the data (or at least a
fair proportion of them) and comparing each one to the original test statistic $T_0$. In
certain special cases, more efficient methods do exist. For an example of such a method
see Soms [Ref. 8]. It is possible to reduce the number of combinations that need to be
considered through the use of more intelligent enumeration schemes, backtrack search
or other techniques. However, even though considerable savings could be achieved, the
number of combinations remaining continues to grow at a rate proportional to total
enumeration. Thus the computation time required to perform the general
randomization test is a function of the number of combinations involved.
2. Computer Time Considerations

As an example of how rapidly enumeration becomes untenable, consider a
computing device capable of generating combinations of data sequentially and
comparing each one to a fixed value. Assume that each combination could be formed,
compared, and counted in a time span of 1 microsecond (this is very fast, even for a
large computer). Also assume it is desired to use this device to perform the
randomization test on samples of sizes up to n=30 and m=30. Such sample sizes are
very common in practice. The following table gives the total time that would be
required to enumerate all combinations of the form $\binom{n+m}{n}$ using this device. For
simplicity, only equal sample sizes are included (n = m):


TABLE 2
COMPUTATION TIMES

n or m    Approximate Time Requirement
  5       0.00025 seconds
 10       0.18 seconds
 15       155 seconds
 20       38.3 hours
 25       4.01 years
 30       37.5 centuries
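
The entries of Table 2 follow from the 1 microsecond assumption by simple arithmetic, which the following sketch reproduces. The function name and the use of a 365.25-day year are assumptions of this example.

```python
from math import comb

SECONDS_PER_YEAR = 365.25 * 24 * 3600  # assumed length of a year


def enumeration_seconds(n, seconds_per_combination=1e-6):
    """Time to enumerate all C(2n, n) combinations when n = m."""
    return comb(2 * n, n) * seconds_per_combination


for n in (5, 10, 15, 20, 25, 30):
    seconds = enumeration_seconds(n)
    print(f"n = m = {n:2d}: {seconds:.3g} s "
          f"= {seconds / 3600:.3g} h "
          f"= {seconds / SECONDS_PER_YEAR:.3g} yr")
```

Reading across each printed line gives the same quantity in seconds, hours and years, so the unit used in each row of the table can be checked directly.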


Even if the number of combinations could be reduced by a factor of 100 through 
careful enumeration or backtrack search as mentioned before, the time requirements 
would remain virtually untenable. Further, if a new computing device were installed 
that performed the calculations 1000 times faster, our ability to process the data sets 
would be increased only marginally. 
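
The marginal benefit of faster hardware can be made concrete by asking how large n = m may be before full enumeration exceeds a fixed time budget. The sketch below assumes a budget of one day, which is an arbitrary choice for illustration, and compares the 1 microsecond device with a hypothetical device 1000 times faster.

```python
from math import comb


def largest_feasible_n(budget_seconds, seconds_per_combination):
    """Largest n (with n = m) whose C(2n, n) combinations fit in the budget."""
    n = 0
    while comb(2 * (n + 1), n + 1) * seconds_per_combination <= budget_seconds:
        n += 1
    return n


ONE_DAY = 24 * 3600  # assumed time budget in seconds

slow = largest_feasible_n(ONE_DAY, 1e-6)  # 1 microsecond per combination
fast = largest_feasible_n(ONE_DAY, 1e-9)  # hypothetical 1000-fold speedup
print(slow, fast)
```

Since the count of combinations roughly quadruples each time n increases by one, a thousandfold speedup extends the feasible sample size by only about five observations per sample.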

The examples above demonstrate that the direct method of performing the 
randomization test for two independent samples is not efficient in any reasonable sense 
of the word. As sample sizes increase, the inefficiency of the method makes it 
unsuitable for practical use. In the next section, it is shown that no efficient algorithm
is likely to exist for performing this test. To define what is meant by an efficient
algorithm, some ideas from the theory of NP-completeness are introduced.
B. ALGORITHMIC EFFICIENCY AND NP-COMPLETENESS
1. Basic Concepts and Terminology 
To begin a discussion of algorithmic efficiency, several basic terms must be 
defined. An excellent treatment of the subject is given in Garey and Johnson [Ref. 9], 
and the terminology used there is adopted herein. Following Garey and Johnson, an
algorithm is a step-by-step procedure used to solve a problem. A problem is

...a general question to be answered, usually possessing several parameters, or
free variables, whose values are left unspecified. A problem is described by giving:
(1) a general description of all its parameters, and (2) a statement of what
properties the answer, or solution, is required to satisfy. An instance of a problem
is obtained by specifying particular values for all the problem parameters. ... An
algorithm is said to solve a problem $\Pi$ if that algorithm can be applied to any
instance $I$ of $\Pi$ and is guaranteed always to produce a solution for that instance
$I$.


To show the use of the above terminology, consider two classic problems from 
graph theory. The first is due to the 19th century mathematician William Rowan
Hamilton. The problem is to decide if an arbitrary graph consisting of a collection of
vertices and edges has a path that passes through each vertex exactly once. Such a
path, if it exists, is known as a Hamiltonian path. The parameters of this problem
consist of a finite set $V = \{v_1, v_2, \ldots, v_k\}$ of vertices and a set $E = \{e_1, e_2, \ldots, e_j\}$ of
edges between pairs of vertices. A solution is an ordering $\langle v_{\pi(1)}, v_{\pi(2)}, \ldots, v_{\pi(k)} \rangle$ of
the vertices such that $(v_{\pi(i)}, v_{\pi(i+1)}) \in E$ for $1 \le i < k$ and each vertex is visited exactly
once. An instance of the problem would be obtained by giving specific vertices and 
edges (referenced to a coordinate system, for example). 

The second problem, due to Euler, is very similar to Hamilton’s problem. It
can be stated using the same sets as above, except that in this case, a path is sought
which traverses each edge in the graph exactly once. Such a path is called an Eulerian
path. Both Hamilton’s problem and Euler’s problem can be solved by exhaustive
tabulation of all possible paths, checking each one to see if it has the required 
properties. This approach has the same problems as complete enumeration of 
combinations in the randomization test. The number of possible paths grows in a 
similar fashion, and the algorithm quickly becomes too inefficient for practical use. 

The important distinction between these two graph theoretic problems is that
there is a much easier way to solve Euler’s problem than exhaustive tabulation. Euler
showed that a path traversing each edge of a graph exactly once must exist if the graph
meets two conditions: (1) the graph must be connected and (2) there must be an even
number of edges that meet at any vertex, with the exception of the starting and
finishing points of the path. The computation time required to check this is related to 
the number of vertices and edges, not the number of possible paths. An algorithm 
using this approach is practical even when the number of vertices and edges is very 
large, despite the fact that the number of possible paths may be astronomical. In the 
case of Hamilton’s problem, however, no such simple and efficient method of solution 
has ever been found. As discussed by Lewis and Papadimitriou [Ref. 10: p. 102], the 
most efficient methods available today are fundamentally no better than exhaustive 
tabulation. 
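
Euler's two conditions translate into a short check whose running time depends only on the number of vertices and edges. The following is a minimal sketch, assuming an undirected graph supplied as a list of vertex pairs; the function name and the treatment of a graph with no edges as trivially satisfying the conditions are choices made for this example.

```python
from collections import defaultdict


def has_eulerian_path(edges):
    """Check Euler's conditions for an undirected graph given as an edge list."""
    adjacency = defaultdict(list)
    for u, v in edges:
        adjacency[u].append(v)
        adjacency[v].append(u)
    if not adjacency:
        return True  # assumption: no edges means nothing to traverse

    # Condition (1): every vertex that has an edge is reachable from any other.
    start = next(iter(adjacency))
    seen = {start}
    stack = [start]
    while stack:
        node = stack.pop()
        for neighbour in adjacency[node]:
            if neighbour not in seen:
                seen.add(neighbour)
                stack.append(neighbour)
    if len(seen) != len(adjacency):
        return False

    # Condition (2): even degree everywhere, except possibly at the two
    # end points of the path.
    odd = sum(1 for neighbours in adjacency.values() if len(neighbours) % 2)
    return odd in (0, 2)


# The seven bridges of Koenigsberg: four land masses, all of odd degree.
bridges = [("A", "B"), ("A", "B"), ("A", "C"), ("A", "C"),
           ("A", "D"), ("B", "D"), ("C", "D")]
print(has_eulerian_path(bridges))
```

Each edge is examined a constant number of times, so the work grows linearly with the size of the graph, in contrast with tabulating all possible paths.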

An algorithm that operates ‘efficiently’ could be viewed as one that uses a 
minimum amount of computer resources to arrive at the solution to a problem. 
Computer resources include things such as memory space, CPU time, and I/O 
(Input/Output) capacity. However, since the critical resource is usually time, the ‘most 
efficient’ algorithm is normally the fastest one. The time requirements of an algorithm 
can be expressed in terms of a single variable, the ‘size’ of a problem instance. 
Informally, this can be thought of as the amount of data that must be input to 
describe a given instance. Examples would be the number of vertices and edges in 
Hamilton’s problem or the number of X and Y observations in the randomization test. 
The formal way to characterize problem size views the situation from the standpoint of 
actual entry into a computing device. Problems must be input in a single finite string of 
symbols chosen from a fixed set, or input alphabet. An encoding scheme must be 
specified, which maps problem instances into the symbolic strings describing them. The 
input length for an instance of a problem is the number of symbols required to specify 
the instance under the given encoding scheme. As indicated in Garey and Johnson 
[Ref. 9: pp.5-6], the input length is what is used as the formal measure of instance size. 

The time complexity function for an algorithm expresses its time requirements 
by giving, for each possible input length, the largest amount of time needed by the 
algorithm to solve a problem instance of that size. This function is not well defined
unless a particular computing device, input alphabet and encoding scheme are 
specified. However, it turns out that these are relatively unimportant factors. What is 
important is the form of the time complexity function. The following discussion from
Garey and Johnson [Ref. 9: p.6] introduces this idea:

Different algorithms possess a wide variety of different time complexity functions,
and the characterization of which of these are ‘efficient enough’ and which are 
‘too inefficient’ will always depend on the situation at hand. However, computer 
scientists recognize a simple distinction that offers considerable insight into these 
matters. This is the distinction between polynomial time algorithms and 
exponential time algorithms. 


2. Polynomial Time and Exponential Time Algorithms 
A polynomial time algorithm is defined to be one whose time complexity
function can be bounded by a polynomial. That is, there exists a constant c such that

$|f(N)| \le c\,|p(N)|$ (eqn 3.1)

for all values of $N \ge 0$, where f(N) is the time complexity function, p(N) is a
polynomial function of N, and N is the input length. An algorithm whose time 
complexity function cannot be so bounded by any finite degree polynomial is called an 
exponential time algorithm [Ref. 9: p.6]. 

The distinction between these two types of algorithms becomes important 
when the input lengths become large. Polynomial functions of degree k will evaluate to
be of the order $N^k$, but exponential functions are allowed to have terms such as $2^N$ or
N!. There is always a value of N beyond which exponential functions grow at a faster
rate than any polynomial function, even if the polynomial is of degree 100. It is for 
this reason that polynomial time algorithms are generally regarded as being much more 
desirable than exponential time algorithms. There are some notable exceptions, 
however. As mentioned in Garey and Johnson [Ref. 9: p.9], the simplex algorithm for 
linear programming has been shown to have exponential time complexity, but it 
typically runs very quickly in practice. Garey and Johnson [Ref. 9: p.8] also observe
that “time complexity as defined is a worst case measure, and the fact that an algorithm
has time complexity $2^n$ means only that at least one problem instance of size n requires
that much time.” Examples of exponential time algorithms that run well in practice are 
rare. Most exponential time algorithms are variations on exhaustive search or complete 
enumeration, while polynomial time algorithms generally exploit some fundamental 
structure of a problem. 
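
The claim that an exponential function eventually overtakes any polynomial, even one of degree 100, can be checked with exact integer arithmetic. The sketch below is illustrative; the function name is hypothetical, and the search starts at N = 2 to skip the trivial case N = 1.

```python
def crossover(degree):
    """Smallest N >= 2 for which 2**N exceeds N**degree (exact integers)."""
    n = 2
    while 2 ** n <= n ** degree:
        n += 1
    return n


for k in (2, 3, 10, 100):
    n = crossover(k)
    print(f"degree {k:3d}: 2**N first exceeds N**{k} at N = {n}")
```

For the degrees shown, the exponential stays ahead once it has overtaken the polynomial, because the ratio of $2^N$ to $N^k$ increases with N once N exceeds k / ln 2.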

3. The Classes P and NP 
Problems for which only exponential time algorithms exist are intractable, in a
sense, because even fairly small instances may never be solved in a realistic amount of
time. For those problems that have polynomial time algorithms, the polynomials
involved typically are not of a high order, and thus instances of practically any size can 
be solved. It would be convenient if all problems of interest could be placed into two 
groups, those having exponential time complexity and those having polynomial time 
complexity. Unfortunately, it is exceedingly difficult to prove that a given problem is 
intractable; that is, no polynomial time algorithm can ever be devised to solve it. For a 
small number of problems it has been shown that exponential time algorithms are the 
only ones possible, but for most practical problems of interest, this has not been done. 

Those problems for which polynomial time algorithms are known to exist are 
in a class denoted P. Euler’s problem is a member of P. In between this class and the 
class of provably intractable problems is another class, denoted NP. Formal 
definitions of these classes usually involve models of computation known as Turing 
machines. However, to gain an understanding of the class NP, the concepts of 
nondeterministic computation and polynomial time verifiability are most important. 

A deterministic algorithm can be thought of as being composed of a 
predetermined sequence of operations that do not vary each time the algorithm is used. 
A nondeterministic algorithm introduces the possibility of randomness at points within 
the procedure. A convenient way to view the operation of such an algorithm is to think 
of it as being composed of two separate stages, the first being a guessing stage and the 
second a checking stage. Given a problem instance, the first stage guesses some 
structure. The second stage checks this structure in a deterministic fashion to see if it is
a solution to the problem. A nondeterministic algorithm is said to operate in 
polynomial time if there exists some guessed structure that solves the problem and this 
structure can be verified by the checking stage in polynomial time [Ref. 9: pp.28-29]. 

The class NP is defined informally to be the class of all decision problems that 
can be ‘solved’ by polynomial time nondeterministic algorithms [Ref. 9: p.29]. A 
decision problem is one that has only a yes or no answer; for example, “Does this 
graph have a Hamiltonian path?”. Most problems of interest can be carefully phrased 
as decision problems, so this is not overly restrictive. A nondeterministic algorithm 
would ‘solve’ Hamilton’s problem in the following way: (1) an arbitrary path through 
the graph would be guessed, and (2) the path would be examined to see if it passes 
through each vertex exactly once. If the graph does have a Hamiltonian path, then one 
of the guesses will lead the algorithm to respond ‘yes’, thus solving the problem. 
Hamilton’s problem is known to be a member of the class NP; this implies that step
(2) above can be performed in polynomial time.

It is very important to note that the word ‘solve’ as used above does not mean
that a nondeterministic algorithm is a realistic method for solving decision problems.
This is only a theoretical concept. In fact, a hypothetical machine using a
nondeterministic algorithm is envisioned as having the ability to pursue an unbounded 
number of independent computational sequences in parallel. Thus, in Hamilton’s 
problem, the fact that there may be an exponential number of possible paths to check 
is not counted. It is only required that, given a path, it can be checked in polynomial
time. It is this notion of polynomial time verifiability that the class NP is intended to 
capture. Most importantly, as Garey and Johnson [Ref. 9: p.12, pp.28-29] point out, 
polynomial time verifiability does not imply polynomial time solvability. 
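
The gap between checking and solving can be illustrated for Hamilton's problem. In the sketch below, the checking stage runs in time polynomial in the size of the graph, while the deterministic substitute for the guessing stage simply tries every ordering of the vertices. The function names and the representation of a graph as a vertex list and an edge list are assumptions made for this example.

```python
from itertools import permutations


def is_hamiltonian_path(vertices, edges, path):
    """Checking stage: verify one guessed ordering in polynomial time."""
    edge_set = {frozenset(edge) for edge in edges}
    # The guess must list every vertex exactly once.
    if len(path) != len(vertices) or set(path) != set(vertices):
        return False
    # Every consecutive pair must be joined by an edge.
    return all(frozenset((a, b)) in edge_set for a, b in zip(path, path[1:]))


def find_hamiltonian_path(vertices, edges):
    """Deterministic stand-in for guessing: up to k! orderings are tried."""
    for candidate in permutations(vertices):
        if is_hamiltonian_path(vertices, edges, candidate):
            return list(candidate)
    return None


chain = [(1, 2), (2, 3), (3, 4)]
print(is_hamiltonian_path([1, 2, 3, 4], chain, [1, 2, 3, 4]))
print(find_hamiltonian_path([0, 1, 2, 3], [(0, 1), (0, 2), (0, 3)]))
```

Verifying a single path costs a number of steps proportional to the number of vertices and edges, whereas the search may have to examine every one of the k! orderings before it can answer no.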

4. NP Complete Problems 

A simplistic way to view the class NP is to think of it as containing ‘hard’ 
problems: those for which polynomial time algorithms are not known, but neither can 
it be proved that none exist. The problems in this class also share the important 
property that any one solution arrived at by ‘guessing’ can be quickly checked, even 
though there may be exponentially many guesses possible. The class P contains ‘easy’ 
problems in the sense that polynomial time algorithms are known for them. 

The relationship between P and NP is fundamental to discussions of 
algorithmic efficiency. It can easily be shown that $P \subseteq NP$. Following Garey and
Johnson [Ref. 9]:


Every decision problem solvable by a polynomial time deterministic algorithm is 
also solvable by a polynomial time nondeterministic algorithm. To see this, one 
simply needs to observe that any deterministic algorithm can be used as the 
checking stage of a nondeterministic algorithm. If $\Pi \in P$, and A is any polynomial 
time deterministic algorithm for $\Pi$, we can obtain a polynomial time 
nondeterministic algorithm for $\Pi$ merely by using A as the checking stage and 
ignoring the guess. Thus $\Pi \in P$ implies $\Pi \in NP$. 


It is widely believed that the inclusion is proper, that is, $P \subseteq NP$ but $P \neq NP$. This has 
not been proven, but all evidence seems to strongly suggest this is the case. This is of 
prime importance, because if P differs from NP, then the set $NP - P$ would not be 
empty; it would contain intractable problems. 

Another concept central to the discussion of algorithmic efficiency is that of 
problems of equivalent difficulty. If several problems can be shown to be related, or of 
equivalent difficulty, then results of considerable generality and power can be obtained. 


Referring again to Garey and Johnson [Ref. 9: p.13]: 


The principal technique used for demonstrating that two problems are related is 
that of ‘reducing’ one to the other, by giving a constructive transformation that 
maps any instance of the first problem into an equivalent instance of the second. 
Such a transformation provides the means for converting any algorithm that 
solves the second problem into a corresponding algorithm for solving the first 
problem. 


The important characterization here is polynomial time reducibility, that is, reductions 
for which the required transformation can be executed by a polynomial time algorithm. 
If one problem can be reduced to another through a polynomial time reduction, this 
ensures that any polynomial time algorithm for the second problem can be converted 
into a corresponding polynomial time algorithm for the first problem. 

There is a subclass of problems within NP that has an important property: 
every problem in NP can be polynomially reduced to one of the problems in this 
subclass. The problems in this subclass are named NP-complete problems. The 
implications of this subclass are far-reaching. If any one of the NP-complete problems 
can be solved with a polynomial time algorithm, then so can every problem in NP. 
Also, if any problem in NP is intractable, then all the NP-complete problems must be 
intractable. In a sense, the NP-complete problems are the ‘hardest’ problems in NP. 
A picture representing the relationships between the classes of problems discussed so 
far is given in Figure 3.1. 

[Figure 3.1 diagram; recoverable region labels: Provably Intractable Problems, NP-complete, NP] 

Figure 3.1 Relationships Between Classes of Problems. 

Hundreds of problems have been shown to be NP-complete since the first 
such problem was identified by Stephen Cook in 1971 (the ‘satisfiability’ problem of 
Boolean logic) [Ref. 9: p.13, p.38]. The list of NP-complete problems includes 
Hamilton’s problem, many well known combinatorial problems, and others from a 
wide variety of disciplines. As more and more problems are added to the list, it appears 
more and more likely that $P \neq NP$ and the NP-complete problems are truly intractable, 
but little progress has been made toward either a proof or a disproof of this conjecture. 
As Garey and Johnson conclude [Ref. 9: p.14], even without such a proof, the 
knowledge that a problem is NP-complete suggests, at the very least, that a major 
breakthrough will be needed to solve it with a polynomial time algorithm. 

5. #P-Complete Problems 

So far the discussion of NP-completeness has centered around decision 
problems with yes-no answers. In many cases, however, the real question to be 
answered goes beyond simply whether a solution exists or not (yes or no). It may be 
important to find out how many solutions there are. Then the problem becomes an 
enumeration problem. For example, associated with the NP-complete decision problem 
‘Does this graph have a Hamiltonian path?’ is the enumeration problem ‘How many 
distinct Hamiltonian paths are there in this graph?’ 

According to Garey and Johnson [Ref. 9: p.167], “Enumeration problems 
provide natural candidates for the type of problem that might be intractable even if 
P=NP.” Even if the basic decision problem could be solved in polynomial time, it is 
not at all clear that the number of distinct solutions could be determined in polynomial 
time. Note that enumeration problems do not require all the solutions to be displayed, 
only counted. Thus the number of Hamiltonian paths in a graph may be exponentially 
large and an exponential amount of time would be required to list them all, but the 
answer to the enumeration problem is just a single number. Some enumeration 
problems can be solved in polynomial time. For example, the question ‘Given a graph 
G, how many Eulerian paths are there for G?’ can be solved with a polynomial time 
algorithm, like the basic decision problem. However, some enumeration problems do 
not appear to be solvable in polynomial time even though the associated decision 
problem can be solved in polynomial time. 

To encompass these considerations, the ideas behind NP-complete problems 
can be extended. A new class, designated #P-complete, can be used to categorize 
enumeration problems. This is intended to capture the additional difficulty of 
enumerating solutions, not just their existence. This class is defined in a way analogous 
to the class of NP-complete problems. Many of the enumeration problems associated 
with basic NP-complete problems are #P-complete. What is interesting is that some 
enumeration problems are now known to be #P-complete even though their associated 
decision problems are not known to be members of NP [Ref. 9: pp.168-170]. 

The significance of all this is that if a practical problem can be shown to be 
#P-complete, the search for an efficient, exact algorithm that solves the problem might 
never be productive. This is not to say that one never will be found, but rather that a 
major breakthrough will be required. And if an algorithm is ever discovered that solves 
the problem in polynomial time, the implications will be very far reaching. Even though 
the members of the class of #P-complete problems are not yet provably intractable, it 
seems reasonable to operate on the assumption that they are. With that in mind, the 
investigation of approximation algorithms is certainly of practical significance. 


C. EFFICIENCY OF ALGORITHMS FOR PERFORMING THE 
RANDOMIZATION TEST 


1. Randomization Test is an Enumeration Problem 

Performing the randomization test for two independent samples to accomplish 
a significance test is an enumeration problem. The remainder of this chapter is devoted 
to showing that it is #P-complete. The fact that it is an enumeration problem is seen 
by considering the structure of the test. The significance level $\alpha$ is obtained by counting 
subsets of size n out of n+m elements such that the sum of the elements in each subset 
is equal to or more extreme than the fixed value $T_0$. Analogous to the problems 
discussed in the last section, an associated decision problem could be stated: “For some 
fixed number K, are there K or more subsets of size n for which the sum of the 
elements is equal to or more extreme than $T_0$?” This could be answered yes or no if a 
value of K were specified beforehand. This would effectively correspond to using the 
randomization procedure to perform a hypothesis test, because the value of K could be 
determined from the desired value of $\alpha_0$, using the relation $\alpha_0 = K / \binom{n+m}{n}$. In the case 
of significance testing, there is no pre-specified value $\alpha_0$. To calculate the significance 
level, we need to know how many subsets have sums equal to or more extreme than 
$T_0$. In this case, the randomization procedure becomes an enumeration problem rather 
than a decision problem. The implications of this are discussed in the last chapter. 
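
The counting just described can be written out directly for very small samples. The following is a minimal sketch, not the program used in this study: the function name `exact_significance_level` is introduced here, and ‘more extreme’ is assumed to mean the lower tail (subset sums that do not exceed the observed statistic).

```python
from itertools import combinations
from math import comb


def exact_significance_level(x, y):
    """Lower-tail significance level of T = sum(x) by full enumeration."""
    pooled = list(x) + list(y)
    n = len(x)
    t0 = sum(x)  # originally observed test statistic
    # Count the size-n subsets whose sum is at most the observed statistic.
    # ASSUMPTION: "more extreme" is taken to mean "smaller".
    count = sum(1 for subset in combinations(pooled, n) if sum(subset) <= t0)
    return count / comb(len(pooled), n)


# Hypothetical data, chosen only to keep the arithmetic visible.
alpha = exact_significance_level([1, 2, 4], [3, 5, 6])
print(alpha)
```

The loop visits every one of the $\binom{n+m}{n}$ subsets, which is the source of the computational burden examined in the remainder of this chapter.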




2. Significance Testing is #P-Complete 

To show that using the randomization procedure to perform a significance test 
is #P-complete, two steps are required. First, an enumeration problem is introduced 
which is known to be #P-complete. Second, performing the randomization procedure is 
shown to be of equivalent difficulty to this problem. The #P-complete problem is termed 
the Kth LARGEST SUBSET problem and is described next. 

The Kth LARGEST SUBSET problem can be stated succinctly using the 
terminology and format of Garey and Johnson [Ref. 9: p.114]: 

Problem Instance: Given a finite set A, a size $s(a) \in Z^+$ for each $a \in A$, and two 
nonnegative integers $B \le \sum_{a \in A} s(a)$ and $K \le 2^{|A|}$. 

Question: Are there K or more distinct subsets $A' \subseteq A$ for which the sum of the sizes 
of the elements in A' does not exceed B? 

The notation |A| is defined as the number of elements in the set A. It is not yet known 
if this decision problem is in the class NP, but it is known that the corresponding 
enumeration problem is #P-complete. 

Performing the randomization test for two independent samples (assuming 
significance testing) can be described using the same kinds of set theoretic objects as 
are used in the Kth LARGEST SUBSET problem above. This can be done in the 
following way. Let A be the set of n+m elements consisting of the X and Y 
observations taken together; that is, $A = \{X_1, X_2, \ldots, X_n, Y_1, Y_2, \ldots, Y_m\}$. Let the 
size s(a) be the positive integer representation of each element of A. This does not 
restrict applicability of this result to positive integer observations, however. To show 
why, consider the following. Note that the test statistic being used, $T = \sum X_i$, is just the 
sum of n elements selected from the set A. Suppose some of the elements of A are 
negative. Then choose a positive constant q such that when q is added to every 
member of the set A, all the elements will become positive numbers. Every value of the 
test statistic will also be increased by a constant value, namely nq. This has the effect 
of shifting the randomization distribution by a fixed amount, and it is obvious that the 
counting process used to determine the significance level of the test is unaffected. 

The next question that might be asked is, what if the elements of A (the X and 
Y observations) are real numbers? If the elements in A are the observations from 
some actual experiment, then any measuring device used can only produce results 
accurate to within some fixed number of decimal places. Therefore, even though the set 
of possible measurements is theoretically a subset of the real numbers, in actuality it 
can only be a collection of integer values over some range; the decimal point is 
immaterial. Even if real numbers could actually be obtained from some experiment, 
they would still have to be represented internally in any physical computing device by a 
fixed number of bits. Again, the position of the decimal point is immaterial; the set of 
values that can actually be represented is restricted to some collection of integers. 

The implication of the preceding paragraphs is that any real experimental data 
can be thought of as positive integer valued if computing devices are used to perform 
statistical tests. Thus, any results about algorithmic efficiency stated in terms of 
positive integers apply whether the ‘true’ observations are real numbers, integers, or 
negative values. 
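
The argument of the last three paragraphs can be checked on a small example. This is a sketch under stated assumptions: the helper names `to_positive_integers` and `count_at_most` and the data are hypothetical, and the count is again taken over the lower tail.

```python
from itertools import combinations


def to_positive_integers(values, decimals):
    """Scale by 10**decimals, round to integers, then shift so all are >= 1."""
    scaled = [round(v * 10 ** decimals) for v in values]
    q = max(0, 1 - min(scaled))  # the shift constant q of the text
    return [s + q for s in scaled]


def count_at_most(pooled, n, t0):
    """Count size-n subsets of pooled whose sum does not exceed t0."""
    return sum(1 for s in combinations(pooled, n) if sum(s) <= t0)


# Hypothetical measurements recorded to one decimal place.
x = [-1.5, 0.2, 0.7]
y = [0.4, 1.1, 2.3]
pooled = x + y
ints = to_positive_integers(pooled, decimals=1)

raw_count = count_at_most(pooled, len(x), sum(x))
int_count = count_at_most(ints, len(x), sum(ints[:len(x)]))
print(ints, raw_count, int_count)
```

Both counts agree because multiplying by a positive constant and adding nq to every subset sum preserves the ordering of the sums.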

Next, let the number B equal the value of the originally observed test statistic 
$T_0$. Let K be the number of test statistic values equal to or more extreme than $T_0$. 
Then the enumeration problem associated with the question ‘Are there K or more 
distinct subsets $A' \subseteq A$ for which the sum of the sizes of the elements does not exceed 
B?’ is almost equivalent to performing the two sample randomization test. Note that 
the above question specifies K or more distinct subsets $A' \subseteq A$. This includes all 
subsets of A, regardless of how many elements they contain. The number of such 
subsets is $2^{n+m}$, since the number of elements in A is n+m. In the randomization 
test, though, we are interested in counting only those subsets with a fixed number of 
elements, namely n. This is equivalent to enumerating all instances where the test 
statistic value is equal to or more extreme than $T_0$, since the test statistic is formed by 
subsets of size n. 

Restricting the enumeration problem to subsets of size n is also #P-complete. 
This can be shown as follows. Suppose we had available an algorithm which could 
enumerate the number of subsets of size i for which the sum of the sizes of the 
elements does not exceed B for any fixed value of i such that $0 \le i \le n+m$. Note that 
by selecting i = n, this algorithm would perform the enumeration required for the 
randomization test. Suppose further that this algorithm operated in polynomial time, 
that is, in time bounded by a polynomial p(N) in N = n+m. Then, by simply incrementing 
i sequentially from 0 to n+m and using this algorithm repeatedly, we could count all 
the distinct subsets $A' \subseteq A$ for which the sum of the sizes of the elements does not 
exceed B. This is true because of the relationship 

$$\sum_{i=0}^{N} \binom{N}{i} = 2^N \qquad \text{(eqn 3.2)}$$

where N = n+m. In other words, we could solve the Kth LARGEST SUBSET 
problem by using our algorithm N+1 times. This would mean the time required to 
enumerate all the subsets $A' \subseteq A$ for which the sum of the sizes of the elements does 
not exceed B would be bounded by a function of the form (N+1)p(N), but this is 
easily seen to be another polynomial. This would contradict the presumed intractability 
of the Kth LARGEST SUBSET problem, which is #P-complete and whose solutions are 
not believed to be enumerable in polynomial time. Therefore, the enumeration problem 
that counts only subsets of fixed size n is also #P-complete. 
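
The bookkeeping behind this reduction can be sketched as follows. The fixed-size counter here is a brute-force stand-in (and therefore exponential); it only illustrates how N+1 calls to a fixed-size counter yield the count over all subsets. The function names and the example sizes are assumptions introduced for illustration.

```python
from itertools import combinations


def count_fixed_size(sizes, i, bound):
    """Number of subsets with exactly i elements whose sizes sum to at most bound."""
    return sum(1 for s in combinations(sizes, i) if sum(s) <= bound)


def count_all_subsets(sizes, bound):
    """Total over every subset size i = 0, 1, ..., N, using N + 1 calls."""
    return sum(count_fixed_size(sizes, i, bound) for i in range(len(sizes) + 1))


sizes = [3, 1, 4, 2]  # hypothetical element sizes s(a)
print(count_all_subsets(sizes, bound=5))
```

When the bound equals the sum of all sizes, every subset qualifies and the total is $2^N$, in agreement with equation 3.2.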
IV. ANALYSIS OF AN APPROXIMATE RANDOMIZATION TEST 
PROCEDURE 


A. INTRODUCTION 
1. Reasons For Using An Approximate Method 

In the previous chapter, it was shown that performing the randomization test 
for two independent samples is computationally a #P-complete enumeration problem 
when the method is used for significance testing. This means that a fast and efficient 
way to perform the test is not likely to exist. Certainly one is not known at the present 
time. In practical terms, the amount of computer time required to complete the 
necessary calculations becomes totally unreasonable for large data sets. Therefore, if 
the randomization test is to be used regularly, some way must be found to obtain 
approximate results that are almost as good as the exact results but don’t require 
anywhere near as much computation time. 

2. Considerations When Using Approximations 

There are many approaches one could take in devising an approximate 
randomization test. The idea is to come up with a method that yields significance levels 
very close to those that would result if the exact test were used on the same data. The 
method should give good results over a wide range of conditions and it should require 
only a modest amount of computer time. Ideally, it should be possible to establish 
bounds on the errors involved with using the approximation. These bounds should 
result from an analytic investigation of both the approximation and the exact test. 

Unfortunately, in the case of the randomization test, analytic results are hard 
to come by. When a randomization test is used, the test statistic can take on as many 
as $\binom{n+m}{n}$ values. The distribution of these values is called the randomization distribution. 
It is important to note that this is a conditional distribution. That is, it is formed by 
using the given observations. Therefore, this distribution changes every time a set of 
observations is taken. It can easily be shown that the randomization distribution 
asymptotically approaches one of the standard distributions, such as normal or 
chi-square, but the use of the asymptotic distribution as an approximation may not be 
accurate in some cases. As Conover [Ref. 4: p.327] indicates, when the observations 
change from one sample to the next, it is impossible to measure the accuracy of any 
asymptotic distribution. 


Another problem with developing analytic results is that the underlying 
distributions of the X and Y populations are not required to be of some specific form. 
It might be possible to derive error bounds on a conditional basis. That is, by stating 
something like ‘If the underlying distributions are of the forms F(x) and G(y), then the 
maximum error incurred by using this approximation is H(x, y).’ Of course, the number 
of possible distributions is infinite, and the ‘true’ underlying distributions can never be 
known with certainty, so this approach may have limited value. 

3. Method Studied 

There are several ways that have been used to perform approximate 
randomization tests. One way is to simply use the standard t-test, even though the data 
may not be normally distributed, and then hope that the results are not too far off. 
Other methods have involved using only portions of the data, sampling from the total 
number of combinations, and fitting various distributions. Some of these methods are 
briefly described in the next section. The method studied here with the aid of 
simulation is the 2-moment fit method. Significance levels obtained with this method 
are compared to those obtained from the exact randomization test and the t-test. 
Power curves for each test are also generated as a separate indicator of performance. 


B. PERFORMING APPROXIMATE RANDOMIZATION TESTS 
1. Subsampling 

One way to perform an approximate randomization test is to determine the 
significance level from a subset of the test statistic values making up the randomization 
distribution. The subset consists of combinations chosen at random from the $\binom{n+m}{n}$ 
combinations possible. The test statistic values are computed for these combinations 
only and an approximate significance level is obtained. This is called subsampling and 
the combinations can be selected through random sampling with replacement or 
without replacement. For example, if an experiment yielded 30 X observations and 30 
Y observations (n = m = 30), the total number of test statistic values making up the 
randomization distribution would be $\binom{60}{30}$, which is about $1.18 \times 10^{17}$. Instead of 
comparing all those combinations to the original test statistic value $T_0$, a much smaller 
set of test statistic values, say a few thousand, could be formed from combinations 
selected at random out of the $\binom{60}{30}$ available. This smaller number of test statistic 
values could then be compared to $T_0$ and an approximate significance level could be 
found. 

Subsampling is a very attractive approximation method, since even a few 
thousand test statistic values can be generated at random and compared rather quickly. 
The method has intuitive appeal also, because every combination out of the $\binom{n+m}{n}$ 
possible is considered equally likely if the null hypothesis is true. Sampling from a set 
of equally likely objects should yield representative subsets. The only questions to be 
answered are how large a sample is required and whether sampling should be done with 
or without replacement. Studies done on this subject [Ref. 11: pp.43-45] suggest 
that sampling with replacement is acceptable and that sample sizes as 
small as 1000 can provide good results. 
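
A minimal sketch of subsampling with replacement follows. It is not the procedure of [Ref. 11]; the function name `subsampled_significance_level`, its parameters, and the lower-tail definition of ‘more extreme’ are assumptions made for illustration.

```python
import random


def subsampled_significance_level(x, y, n_samples=1000, seed=None):
    """Approximate lower-tail significance level from random combinations."""
    rng = random.Random(seed)
    pooled = list(x) + list(y)
    n = len(x)
    t0 = sum(x)  # originally observed test statistic
    hits = 0
    for _ in range(n_samples):
        # Each draw picks one combination of n observations at random.
        # Separate draws are independent, so the same combination may be
        # selected more than once (sampling with replacement).
        if sum(rng.sample(pooled, n)) <= t0:
            hits += 1
    return hits / n_samples


# Hypothetical data with n = m = 30, as in the example above.
x = list(range(30, 60))
y = list(range(0, 30))
print(subsampled_significance_level(x, y, n_samples=1000, seed=1))
```

The cost is proportional to the number of sampled combinations rather than to $\binom{n+m}{n}$.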

2. Blocks 

Another approximation method, which is a variation on the subsampling 
scheme, is the use of blocks. This method can be applied to the randomization test for 
two independent samples in the following way, which is described by Boyett and 
Shuster [Ref. 12: p.666]. Within the X and the Y samples, an appropriate number of 
blocks is formed by random allocation, each block having the same number of 
observations. Then an exact randomization test is used on the block sums. For 
example, if the data consisted of 30 X observations and 30 Y observations, six blocks 
of five observations each could be randomly formed within the X’s and the Y’s. The 
sum of the observations in each block would be found. Then the exact randomization 
procedure could be used on the block sums. The number of all such sums would only 
be $\binom{12}{6} = 924$. Again, significant savings could be achieved over performing the exact 
test on all $1.18 \times 10^{17}$ combinations of the observations without blocking. 

How many blocks should be chosen depends on how accurate the results need 
to be and on how much computation time is considered acceptable. Using a small 
number of blocks may be less accurate, but the computer time required will certainly 
be less than if many blocks are used. It should also be noted that it may not be 
possible to form a convenient number of blocks (all containing the same number of 
observations) without discarding some of the data. For example, if we had 23 X 
observations and 26 Y observations, we might form 7 blocks of 3 observations each 
within the X’s and 8 blocks of 3 observations each within the Y’s. In this case, two X 
and two Y observations would have to be discarded. 
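
The block method can be sketched as below. This is an illustrative reading of the description above, not the algorithm of Boyett and Shuster as published; the function names, the `block_size` parameter, and the lower-tail count are assumptions.

```python
import random
from itertools import combinations
from math import comb


def block_sums(values, block_size, rng):
    """Randomly allocate values to blocks of equal size and return block sums."""
    shuffled = list(values)
    rng.shuffle(shuffled)
    n_blocks = len(shuffled) // block_size
    # Observations left over after the last full block are discarded.
    return [sum(shuffled[b * block_size:(b + 1) * block_size])
            for b in range(n_blocks)]


def blocked_significance_level(x, y, block_size, seed=None):
    """Exact lower-tail randomization test applied to the block sums."""
    rng = random.Random(seed)
    bx = block_sums(x, block_size, rng)
    by = block_sums(y, block_size, rng)
    pooled = bx + by
    t0 = sum(bx)
    count = sum(1 for s in combinations(pooled, len(bx)) if sum(s) <= t0)
    return count / comb(len(pooled), len(bx))


# Hypothetical data: 30 X and 30 Y observations, six blocks of five each.
x = list(range(0, 30))
y = list(range(30, 60))
print(blocked_significance_level(x, y, block_size=5, seed=1))
```

With 23 X and 26 Y observations and `block_size=3`, the same sketch forms 7 and 8 blocks and discards two observations from each sample, as in the example above.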

3. T-test as an Approximation 

If random sampling from normal distributions can be assumed, the standard 
two sample t-test is the appropriate parametric procedure that can be used to perform 
a comparison of means. However, even if the underlying distributions are not normal, a 
histogram of the test statistic values from the randomization procedure often resembles 
a bell-shaped normal density. This is true for the test statistic being used here, namely 
$T = \sum X_i$. An equivalent test statistic that yields the same results is 
$T = (\sum X_i)/n - (\sum Y_j)/m$. If this test statistic is used, a histogram of the test statistic 
values is centered at the origin and takes on the appearance of a central t density. In 
fact, the randomization distribution arising from the use of this test statistic is usually 
approximated reasonably well by an appropriately scaled t distribution. Hence, as Box, 
Hunter and Hunter [Ref. 2: pp.95-97] observe, provided that a randomized experiment is 
performed, t-tests can be used as approximations to exact randomization tests even if 
the underlying distributions are not normal. 

4. 2-Moment Fit Method 

The next approximation method to be discussed will be called the 2-moment fit 
method. The basic principle involved is simply that of using a continuous distribution 
to approximate a discrete distribution. As mentioned in the last section, if histograms 
of the true randomization distribution are examined, it becomes apparent that in many 
cases the histograms seem to have a characteristic bell shape as in Figure 4.1. In fact, 
if the null hypothesis is true, the distribution of the randomization test statistic should 
asymptotically approach a normal distribution under easily met conditions 
[Ref. 4: p.327]. With this in mind, it seems reasonable to assume that a normal 
distribution with some mean $\mu$ and standard deviation $\sigma$ might be fitted to the 
randomization distribution, as shown in Figure 4.2. 

Figure 4.1 Typical Randomization Histogram. 

Figure 4.2 Normal Density Fitted to Randomization Histogram. 


If the normal density ‘fits’ reasonably well, the area under a given portion of 
the curve should approximate the corresponding area under the randomization 
histogram bars. The area represented by the histogram bars equal to or more extreme 
than the originally observed test statistic value $T_0$ corresponds to the significance level 
$\alpha$ of the test. This is shown in Figure 4.3. Therefore, the laborious exact calculation of 
$\alpha$ by enumeration can be replaced by fitting an appropriate normal curve and using 
tables to find the required areas. The approximate $\alpha$ obtained in this manner could be 
very quickly calculated, no matter how large the number of test statistic combinations. 

Figure 4.3 Tail Areas Correspond to $\alpha$. 

To ‘fit’ a normal distribution to the distribution of test statistic values, the 
first two moments (functions of $\mu$ and $\sigma$) of the normal distribution must be related to 
two values in the test statistic distribution, hence the name ‘2-moment fit’. Two values 
that could be chosen are simply the end points, that is, the smallest and the largest 
values that the test statistic takes on when the randomization test is actually 
performed. It is easy to find these two values without enumerating all possible 
combinations; just sum the n smallest and then the n largest observations from the 
combined set {X,Y}. Once the smallest and largest test statistic values are found, the 
probabilities associated with their occurrence are easily determined from knowledge of 
the total number of test statistics possible, which is $\binom{n+m}{n}$. These probabilities are used 
to find the $\mu$ and $\sigma$ that completely describe the fitted normal density. For a full 
derivation of the equations involved with this method, see Appendix A. 

The 2-moment fit method seems intuitively appealing. As the number of test 
statistic values that make up the randomization distribution gets large, it seems 
reasonable to expect that a continuous function (the normal distribution) should more 
closely approximate the true discrete distribution. The degree to which the 
approximation yields values ‘close’ to the true $\alpha$ also depends on the degree to which 
the normal curve follows the shape of the true discrete distribution. 
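
One possible reading of the end-point fit is sketched below. The equations of Appendix A are not reproduced here, so the matching rule is an assumption: each end point is assigned a tail probability of $1/\binom{n+m}{n}$ and the normal curve is chosen so that its lower and upper tails beyond the end points have exactly that area. The function name `two_moment_fit_alpha` and the lower-tail significance level are likewise hypothetical.

```python
from math import comb
from statistics import NormalDist


def two_moment_fit_alpha(x, y):
    """Approximate lower-tail significance level from a normal end-point fit."""
    pooled = sorted(list(x) + list(y))
    n = len(x)
    t0 = sum(x)
    t_min = sum(pooled[:n])   # smallest possible test statistic
    t_max = sum(pooled[-n:])  # largest possible test statistic
    total = comb(len(pooled), n)
    if total <= 2 or t_max == t_min:
        raise ValueError("need more than two combinations and distinct end points")
    # ASSUMPTION: P(T <= t_min) = P(T >= t_max) = 1 / total under the fit.
    z = -NormalDist().inv_cdf(1 / total)
    mu = (t_min + t_max) / 2
    sigma = (t_max - t_min) / (2 * z)
    return NormalDist(mu, sigma).cdf(t0)


# Hypothetical data for which the observed statistic is the smallest possible.
print(two_moment_fit_alpha([1, 2, 3], [4, 5, 6]))
```

Under this reading only two sorted partial sums and one normal quantile are needed, so the cost does not grow with the number of combinations.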


C. A COMPARISON OF EXACT AND APPROXIMATE METHODS USING 
SIMULATION 


1. Purpose of Simulation 

The inherent difficulties associated with deriving analytic results describing 
error bounds have been discussed previously. Because of these difficulties, the errors 
that result from the use of approximate methods can be studied conveniently using 
simulation techniques. After a simulation has been run several times, approximate 
error bounds can be established and confidence limits on those bounds can be applied. 
A variety of input conditions and underlying distributions can be entered, and the 
effect of each can be analyzed. 

A simulation was written for the sole purpose of comparing the significance 
level and power of the exact randomization test under varying conditions to two 
alternative methods: 

(1) The 2-moment fit approximation 
(2) The two-sample t-test. 

A complete description of the simulation and an interpretation of the major results 
obtained from it are the subjects of the remainder of this chapter. 
2. Description of Simulation 
a. Overall Structure 
The overall structure of the simulation can be outlined as follows. The 
purpose is to compare the power and significance level of the exact randomization test 
to the 2-moment fit approximate method and the standard t-test. This is accomplished 
by repeatedly generating sets of X and Y observations from preselected distributions. 
The parameters of the X and Y distributions can be independently varied. At each 
repetition, a significance test is performed on the hypothesis 

$$H_0: \mu_X = \mu_Y \quad \text{vs.} \quad H_1: \mu_X \neq \mu_Y$$

using each of the three methods. The results are recorded in a file for analysis by 
separate means. To generate power curves for each test method, the parameters of the 
X distribution are held fixed while the mean of the Y distribution is varied over a 
specified range. Using a pre-selected value denoted $\alpha_0$, the probability of rejecting the 
null hypothesis when it is false (the definition of power) is empirically determined for 
each difference $\mu_X - \mu_Y$ in the specified range. 
The following basic distributions can be selected for the X and Y samples: 

(1) Normal 

(2) Exponential 

(3) Uniform 

All input parameters can be varied. These include: 

(a) Mean of X and Y distributions (all types) 

(b) Standard deviation of X and Y distributions (normal) 

(c) Sample sizes n and m 

(d) Number of repetitions 

(e) Range over which power curves are to be generated 


(f) Value of $\alpha_0$ to use in obtaining power values. 

b. Programming Details 

The simulation was programmed in VS FORTRAN. Routines in both the 
IMSL and NON-IMSL libraries were utilized for random number generation and 
calculation of values associated with the normal and t distributions. A complete 
program listing is provided in Appendix B. 

3. Results and Interpretation 
a. Significance Levels 

The first studies conducted with the simulation were those in which the 
significance levels yielded by each of the three methods were compared. These 
comparisons were made by generating n X observations and m Y observations from 
the same type of distribution, except that $\mu_X$ and $\mu_Y$ could each be varied. Once a set of 
X and Y observations was generated, all three tests (exact randomization, 2-moment fit 
approximation, and the t-test) were performed on that set and the three resulting 
significance levels were recorded. This process was repeated a selectable number of 
times. The following input conditions were varied: 

(1) Distribution type (Normal, Exponential, and Uniform) 

(2) $H_0$ true ($\mu_X = \mu_Y$) and $H_0$ false ($\mu_X \neq \mu_Y$) 

(3) Sample sizes n and m for X and Y sets, respectively 

(4) Number of repetitions 

Sample sizes up to n=11 and m=11 were examined, and up to 200 repetitions were 
used. For each input condition, the significance levels from the 2-moment fit method 
and the t-test were plotted against the corresponding values obtained from the exact 
randomization test. 

The plotted data from these simulation runs indicated that for all sample 
sizes larger than n=4 and m=4, the 2-moment fit method generally produced smaller 
significance values than either the exact randomization test or the t-test, with a 
maximum average error of about 0.2 units of probability. The significance values 
obtained from the t-test were much closer to those from the exact randomization test. 
This behavior was observed for all three distributions and for every combination of 
input parameters. Example plots appear in Figures 4.4 and 4.5. 

For simulated sample sizes less than n=4 and m=4, the fact that the 
randomization test can only produce a discrete set of significance values tended to 
introduce more variability in the results. It was also noticed that the 2-moment fit 
method yielded essentially the same values as the other two methods when the 
significance levels were close to either 0 or 1. 
b. Power Curves 

To develop power curves, the simulation was run in a manner similar to the 
significance testing situation. The same input conditions were varied, except that a 
value of $\alpha_0$ was specified beforehand. Each time a computed significance level occurred 
that was less than $\alpha_0$, the test that produced that level was counted as having rejected 
the null hypothesis $H_0$. The number of times each test rejected $H_0$ was then divided by 
the number of repetitions to yield average power values. This was repeated for a range 
of $\mu_Y$ values around a fixed value of $\mu_X$. 
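
The rejection-counting procedure can be sketched as follows. This is not the VS FORTRAN program of Appendix B; it is a small illustration that assumes normal observations with unit standard deviation, a lower-tail exact randomization test, and the hypothetical function names `exact_alpha` and `estimate_power`.

```python
import random
from itertools import combinations
from math import comb


def exact_alpha(x, y):
    """Lower-tail exact significance level of T = sum(x)."""
    pooled = list(x) + list(y)
    t0 = sum(x)
    count = sum(1 for s in combinations(pooled, len(x)) if sum(s) <= t0)
    return count / comb(len(pooled), len(x))


def estimate_power(mu_x, mu_y, n, m, alpha0, repetitions, seed=None):
    """Fraction of repetitions in which the significance level is below alpha0."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(repetitions):
        # ASSUMPTION: normal observations with standard deviation 1.
        x = [rng.gauss(mu_x, 1.0) for _ in range(n)]
        y = [rng.gauss(mu_y, 1.0) for _ in range(m)]
        if exact_alpha(x, y) < alpha0:
            rejections += 1
    return rejections / repetitions


# One empirical power value per mean of Y, holding the mean of X fixed at 0.
for mu_y in (0.0, 1.0, 2.0):
    print(mu_y, estimate_power(0.0, mu_y, n=4, m=4, alpha0=0.05,
                               repetitions=200, seed=1))
```

Because this sketch uses a lower-tail test, it only detects alternatives in which the mean of Y exceeds the mean of X.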

Examples of power curves generated from the simulation appear in Figure 
4.6. It appears from the power curves that the 2-moment fit method rejected the null 
hypothesis too often. That is, the power curves for this method were artificially high. 
This follows from the fact that the 2-moment fit method generally yielded significance 
values that were too low. When the null hypothesis is true ($\mu_X = \mu_Y$), the power curve 
should pass through the selected value of $\alpha_0$. This was not the case for the 2-moment 
fit curves; they were consistently too high. 

The power of the t-test was close to the power of the exact randomization 
test for runs involving the normal and the uniform distributions. However, for the 
exponential distribution, the exact randomization test power curve was always above 
the power curve for the t-test. This is consistent with the theoretical results discussed in 
Chapter Two - namely that the randomization test is the uniformly most powerful test 
against the subclass of alternatives that includes the exponential densities. 

c. Randomization Histograms 

Randomization distribution histograms were plotted for many of the input 
conditions used in the simulation. Most of these were unimodal in appearance, as 
expected. However, some of the histograms resulting from runs involving the 
exponential distribution were multimodal when the null hypothesis was false. For two 
examples, see Figure 4.7. This behavior could help explain why the 2-moment fit 
method does not approximate the true significance level very well in these cases. The 
2-moment fit method tries to fit a (unimodal) normal density to the test statistic values, 
and if those values exhibit multimodal tendencies, large errors are likely. 

Figure 4.7 Multimodal Histograms. 

4. Summary of Results 

The most significant results obtained from the simulation are (1) the t-test is a 
good approximation to the exact randomization test in most cases, and (2) the 
2-moment fit method usually yields smaller significance values than either the 
randomization test or the t-test. The maximum average error incurred is about 0.2 
units of probability. The reason the 2-moment fit method does not work very well is 
probably related to its use of the two most extreme values of the test statistic. Finally, 
the power curves that were generated showed that the randomization test can be more 
powerful than the t-test when samples are exponentially distributed. 

V. SUMMARY 


A. MAJOR RESULTS AND CONCLUSIONS 

This thesis has addressed issues related to the practical implementation of the 
randomization test for two independent samples. The test was described as a method 
for comparing the means of two populations X and Y from which independent samples 
have been drawn. The method can be categorized as a nonparametric statistical 
procedure because assumptions about the specific form of the X and Y distributions 
and associated parameters are not necessary. 

Although many nonparametric procedures are 'weaker' than corresponding 
parametric techniques, the randomization test is at least as good from a theoretical 
standpoint as its parametric counterpart, the t-test. In some cases, its performance can 
be better. Some of the indicators of a good statistical test are efficiency, unbiasedness 
and power. The randomization test has been shown to have an asymptotic relative 
efficiency of 1.0, it is an unbiased test, and it is the uniformly most powerful test in 
certain situations. Each of these results obtained from the literature was discussed. The 
implication is that the randomization test should be the preferred method of testing 
equality of means unless reasonable justification exists for the use of the t-test 
(normality assumptions can be supported, for example). 

Even though the randomization test may be the best way to compare population 
means in theory, it can be so time-consuming to actually perform the test on a 
computer that it is not often used unless sample sizes are relatively small. The structure 
of the test is basically a counting procedure involving combinations of the X and Y 
observations. As the number of observations increases, the number of possible 
combinations becomes so huge that even the fastest computing machinery cannot 
perform the test in a realistic amount of time. There is no known way to perform the 
test efficiently for large sample sizes in the general case. 
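
To give a feel for how quickly the number of combinations grows, the short Python sketch below prints the number of test statistics the exact test must examine for equal sample sizes n = m. The function name and the particular sample sizes are chosen here for illustration only.

```python
import math


def num_test_statistics(n, m):
    """Number of ways to choose the n X-values from the n + m pooled values."""
    return math.comb(n + m, n)


if __name__ == "__main__":
    for k in (5, 10, 15, 20, 30):
        print(f"n = m = {k:2d}: {num_test_statistics(k, k):,} combinations")
```

Going from n = m = 5 to n = m = 15 already raises the count from 252 to more than 155 million.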

To be more specific about what is computationally efficient and what is not, 
topics from the theories of NP-complete and #P-complete problems were introduced in 
this thesis. Algorithms for performing tasks or solving problems on a computer can be 
broadly classed as efficient if they can be executed in polynomial time. If a problem can 
be classified as NP-complete or #P-complete, it is extremely unlikely that a polynomial 
time algorithm exists which can solve it. The randomization test for two independent 
samples was shown to be a #P-complete enumeration problem when significance 
testing is being performed. Therefore, an efficient algorithm for implementing the test 
on a computer is not likely to exist. 

Because of the problem of excessive computation time, ways have been sought to 
perform the randomization test approximately; that is, to obtain significance values 
close to those that would result if the exact test were used on the same data. Some of 
the ways that have been suggested to perform an approximate test include 
subsampling, the use of blocks, asymptotic distributions or simply using the standard 
t-test. There are advantages and disadvantages involved with the use of each of these 
methods. 

Another way to perform an approximate test is to fit a normal distribution to the 
distribution of test statistics that would result if the exact procedure were used. This 
method extracts the largest and the smallest test statistic values and uses them to find 
the first two moments of the fitted normal distribution, hence the name 2-moment fit 
method. This method was studied with the aid of a simulation. The simulation 
compared the performance of the 2-moment fit approximation to the exact 
randomization test and the standard t-test. Significance levels were found and power 
curves were developed for each test under varying conditions. 

Several conclusions could be drawn from analyzing the simulation data. The first 
conclusion is that the 2-moment fit method will, in general, underestimate the true 
significance level that would result from using the exact randomization test. This 
behavior occurred for all conditions studied in the simulation, which included changes 
in the sample distributions, changes in location parameters, and both true and false 
null hypotheses. The maximum average error resulting from the use of the 2-moment 
fit method approximation when the null hypothesis is true is about 0.2 units of 
probability. Error in this context is defined to be the true significance level from the 
randomization test minus the approximate significance level. 

Another important conclusion is that the t-test is quite adequate as an 
approximation to the exact randomization test in most cases. Statements to this effect 
are in the literature, and the simulation results proved to be consistent. The 
significance values produced by the t-test were generally very close to those obtained 
from the exact randomization test. When power curves were developed, though, it was 
demonstrated that the exact randomization test can be more powerful than the t-test 
when the underlying distributions are exponential. This is also consistent with 
theoretical results identifying the randomization test as the uniformly most powerful 
test in that particular situation. 

The overall conclusion of this thesis is that the randomization test for two 
independent samples should be used in its exact form for testing equality of means if 
sample sizes are small and there is concern over whether or not assumptions of 
normality can be justified. When sample sizes become large enough that performing an 
exact randomization test requires more than a reasonable amount of time, the t-test 
provides good approximate results. Of course, the t-test is always the most appropriate 
test to use in the first place if one is willing to assume normality actually exists. 


B. AREAS FOR FURTHER RESEARCH 
1. Approximations 

Approximate methods appear to be the most practical ways to implement 
randomization tests if they are desired for large sample sizes. More research in this area 
could be of value. It might even be possible to obtain more accuracy from the 
2-moment fit method in some way. But this research indicates that a significant 
improvement would be required before the method could be considered better than the 
t-test as an approximation. 

2. Pseudo-Polynomial Time Algorithms 

One area of research that could prove to be very significant would be the 
development of a pseudo-polynomial time algorithm to perform the randomization test 
when hypothesis testing is being done. A pseudo-polynomial time algorithm is one that 
can be executed in polynomial time if a bound on the allowable input lengths is 
established ahead of time. For a more detailed explanation, see Garey and Johnson 
[Ref. 9: p.91]. This would correspond to selecting upper limits for the sample sizes n 
and m and designing an algorithm based on the knowledge that larger sample sizes will 
not be input to the routine. An example of this kind of approach is the use of dynamic 
programming to solve the classic knapsack problem. 
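
As a rough illustration of the dynamic programming idea, the Python sketch below counts how many choices of n values from the pooled data give each possible sum, without listing the individual combinations. It is one possible reading of the suggestion, not an algorithm taken from the thesis, and it rests on an assumption that is stated loudly here: the observations must be non-negative integers (for example, measurements scaled to a fixed precision). Its running time grows with the total of the observations rather than with the number of combinations, which is the knapsack-style trade-off. All function names are hypothetical.

```python
import itertools


def count_sums_dp(values, n):
    """Return a list c where c[s] is the number of ways to pick n of the
    values (non-negative integers, an assumption of this sketch) with sum s."""
    total = sum(values)
    counts = [[0] * (total + 1) for _ in range(n + 1)]
    counts[0][0] = 1
    for v in values:
        # Go through subset sizes in decreasing order so each value is used once.
        for j in range(n, 0, -1):
            row, prev = counts[j], counts[j - 1]
            for s in range(v, total + 1):
                row[s] += prev[s - v]
    return counts[n]


def brute_force_counts(values, n):
    """Same table by total enumeration, for checking small cases."""
    table = [0] * (sum(values) + 1)
    for combo in itertools.combinations(values, n):
        table[sum(combo)] += 1
    return table


def significance_dp(x, y):
    """Two-sided exact significance level computed from the count table."""
    dist = count_sums_dp(list(x) + list(y), len(x))
    t0 = sum(x)
    n_le = sum(dist[: t0 + 1])
    n_ge = sum(dist[t0:])
    return min(1.0, 2.0 * min(n_le, n_ge) / sum(dist))


if __name__ == "__main__":
    data = [3, 1, 4, 1, 5, 9, 2]
    print(count_sums_dp(data, 3) == brute_force_counts(data, 3))
    print(significance_dp([1, 2, 3], [4, 5, 6]))
```

The table has (n + 1) rows and one column per attainable sum, so the work is proportional to (n + m) times n times the total of the observations.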

It was shown in Chapter Three that using the randomization test to perform 
significance testing is a #P-complete enumeration problem. However, if hypothesis 
testing is being performed, the test is really a decision problem with a yes or no answer. 
If maximum allowable sample sizes were to be established in advance, an approach 
similar to dynamic programming might be used to solve the problem much more 
efficiently than using total enumeration. An indication of how this could be applied 
appears in Garey and Johnson [Ref. 9: pp.90-92]. If a suitable algorithm could be 
designed that runs quickly in practice (even if it is theoretically a pseudo-polynomial 
time algorithm), an important step would be made toward more widespread use of 
randomization tests for statistical hypotheses. 

APPENDIX A 
DERIVATION OF THE 2-MOMENT FIT METHOD 


Purpose: To obtain approximate significance values by fitting a normal distribution to 
the distribution of exact randomization test statistics. Areas under the resulting 
normal curve correspond to the proportion of test statistics equal to or more extreme 
than the originally observed value $T_0$. 

Step 1: Find $\mu$ and $\sigma$ that define a fitted normal density function. 

Recall that the test statistic is just the sum of the X observations: 

$$T = \sum_{i=1}^{n} X_i .$$

Let n be the number of X observations and m be the number of Y observations. Let 
$T_s$ be the smallest test statistic value. This value can easily be found by summing the n 
smallest observations from the combined set {X,Y}. The combined set {X,Y} is the set 
of all the X and Y observations taken together. Similarly, let $T_l$ be the largest test 
statistic value, which can be found by summing the n largest observations from the set 
{X,Y}. 

It is possible that either $T_s$ or $T_l$ (or both) are not unique. More than one test 
statistic value might be the smallest, for example. This could happen if the number of 
observations is small or there are many ties. To account for this possibility, define the 
numbers $j_s$ and $j_l$ in the following way: 

$j_s$ = number of smallest test statistic values 
$j_l$ = number of largest test statistic values. 

One way to determine the numbers $j_s$ and $j_l$ is as follows. Order the set {X,Y} of 
n+m observations from smallest to largest. Look at the observation in position n. If it 
is unique, then $T_s$ is unique and $j_s = 1$. If the observation in position n is not unique, 
then $T_s$ is not unique. Assume there are k observations that are equal to the 
observation in position n. Also assume that the k equal observations begin in position 
n-r+1. Then $j_s = \binom{k}{r}$. The number $j_l$ is determined similarly, except the set {X,Y} 
must be looked at in the opposite direction, from largest to smallest. 

Once $T_s$, $T_l$, $j_s$ and $j_l$ have been found, the two extreme points of the 
randomization distribution are defined. A normal density function is fitted by matching 
its tail areas to the probabilities of randomly selecting the values $T_s$ and $T_l$ out of all 
the $\binom{n+m}{n}$ test statistics available. Let $p_s$ be the probability of selecting $T_s$ and let $p_l$ be 
the probability of selecting $T_l$. Then $p_s = j_s / \binom{n+m}{n}$ and $p_l = j_l / \binom{n+m}{n}$. 

Next, let Y represent an arbitrary random variable that is normally distributed 
with mean $\mu$ and standard deviation $\sigma$, and let $\psi$ be its density function. To match the 
tail areas of the function $\psi$ to the probabilities $p_s$ and $p_l$, set 

$$P(Y \le T_s) = p_s$$

for the lower tail area, and 

$$P(Y \ge T_l) = p_l ,$$

which is equivalent to 

$$P(Y \le T_l) = 1 - p_l$$

for the upper tail area. Letting Z represent a standard normal random variable, the 
above probabilities can be rewritten in terms of the standard normal distribution 
function by subtracting $\mu$ and dividing by $\sigma$: 

$$P\left(Z \le \frac{T_s - \mu}{\sigma}\right) = p_s \quad \text{and} \quad P\left(Z \le \frac{T_l - \mu}{\sigma}\right) = 1 - p_l .$$

Let $z_s$ be the percentile of the standard normal distribution associated with the 
probability $p_s$. That is, $P(Z \le z_s) = p_s$. Similarly, let $z_l$ be the percentile associated 
with the probability $1 - p_l$. Then 

$$z_s = \frac{T_s - \mu}{\sigma} \quad \text{and} \quad z_l = \frac{T_l - \mu}{\sigma} .$$

Multiplying through by $\sigma$ and rearranging yields the two equations 

$$\mu + z_s \sigma = T_s$$

$$\mu + z_l \sigma = T_l .$$

The above system of linear equations can be easily solved for the quantities $\mu$ and $\sigma$ by 
standard methods to yield 

$$\mu = \frac{z_l T_s - z_s T_l}{z_l - z_s} \quad \text{and} \quad \sigma = \frac{T_l - T_s}{z_l - z_s} .$$

These are the values of $\mu$ and $\sigma$ that define the fitted normal density function $\psi$. 


Step 2: Relate the area under the fitted normal density function to the proportion of 
test statistics equal to or more extreme than $T_0$. 

Two cases must be considered, depending on whether $T_0$ is in the upper or lower tail of 
the distribution of all $\binom{n+m}{n}$ test statistics. 

Case 1: $T_0$ is in the lower tail. 

Let $\alpha$ be the significance level that would result if an exact randomization test 
were to be performed on the same data. Recall that $\alpha$ is found from the proportion of 
test statistics whose values are equal to or more extreme than the originally observed 
value $T_0$. This proportion is doubled to yield the value of $\alpha$ because the test is two 
tailed. The proportion of test statistics whose values are equal to or more extreme than 
$T_0$ is approximately the same as the area under the fitted normal density function $\psi$ to 
the left of $T_0$ since $T_0$ is in the lower tail. Since areas under a normal density function 
correspond to probabilities, the following relation holds: 

$$\alpha \approx 2P(Y \le T_0) = 2P\left(Z \le \frac{T_0 - \mu}{\sigma}\right)$$

where Z is the standard normal random variable. Substituting the values of $\mu$ and $\sigma$ for 
the fitted density $\psi$ in the equation yields 

$$\alpha \approx 2P\left(Z \le \frac{T_0 - \dfrac{z_l T_s - z_s T_l}{z_l - z_s}}{\dfrac{T_l - T_s}{z_l - z_s}}\right)$$

which simplifies to 

$$\alpha \approx 2P\left(Z \le \frac{z_l (T_0 - T_s) + z_s (T_l - T_0)}{T_l - T_s}\right) .$$

The probability on the right can be found by consulting a table of the standard normal 
probability function. 

Case 2: $T_0$ is in the upper tail. 

The same reasoning used in Case 1 is applicable. Due to the symmetry of the 
normal distribution, $P(Z \ge c) = P(Z \le -c)$ for all c. Therefore, the resulting 
approximation formula for $\alpha$ is the same as in Case 1 with the exception of a minus 
sign: 

$$\alpha \approx 2P\left(Z \le -\frac{z_l (T_0 - T_s) + z_s (T_l - T_0)}{T_l - T_s}\right) .$$
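
The two steps of the derivation can be collected into a short Python sketch. It is offered as an illustration only and is not the Appendix B program: the function names are hypothetical, the standard library class statistics.NormalDist stands in for the normal table lookup, ties are detected by exact equality, and the data are assumed not to be all identical (otherwise the largest and smallest statistics coincide).

```python
import math
from statistics import NormalDist


def tie_count(sorted_vals, n):
    """Number of n-subsets attaining the smallest possible sum.

    sorted_vals must be in ascending order. With k values equal to the one
    in position n, r of them within the first n positions, the count is C(k, r).
    """
    pivot = sorted_vals[n - 1]
    k = sorted_vals.count(pivot)
    r = sum(1 for v in sorted_vals[:n] if v == pivot)
    return math.comb(k, r)


def two_moment_fit(x, y):
    """Approximate two-sided significance level from the 2-moment fit."""
    n = len(x)
    pooled = sorted(list(x) + list(y))
    nc = math.comb(len(pooled), n)            # number of test statistics
    t0 = sum(x)                               # observed statistic
    t_s = sum(pooled[:n])                     # smallest statistic
    t_l = sum(pooled[-n:])                    # largest statistic
    j_s = tie_count(pooled, n)
    # Negating the reversed list turns "largest" into "smallest".
    j_l = tie_count([-v for v in reversed(pooled)], n)
    std = NormalDist()
    z_s = std.inv_cdf(j_s / nc)               # P(Z <= z_s) = p_s
    z_l = std.inv_cdf(1.0 - j_l / nc)         # P(Z <= z_l) = 1 - p_l
    c = (z_l * (t0 - t_s) + z_s * (t_l - t0)) / (t_l - t_s)
    p = std.cdf(c)
    # Case 1 (lower tail) gives 2p, Case 2 (upper tail) gives 2(1 - p).
    return 2.0 * min(p, 1.0 - p)


if __name__ == "__main__":
    print(two_moment_fit([1, 2, 3], [4, 5, 6]))
    print(two_moment_fit([2.1, 3.4, 1.9, 4.0], [3.8, 5.2, 4.4, 6.1]))
```

When the observed statistic is itself the smallest or largest value, as in the first call, the fitted tail area matches the exact proportion by construction; the approximation error arises for observed values between the two extremes.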




APPENDIX B 
SIMULATION PROGRAM LISTING 

C     Principal Variable Names: 
C     A()........ Output vector used in combinatorial procedure 
C     ALPHA...... Significance level for power curves 
C     APOWER..... Approximate power 
C     APPROX..... Approximate significance level 
C     DELTA...... Difference in means 
C     DTYPE...... Distribution type 
C     DXY()...... Data X and Y vector 
C     EPOWER..... Exact power from randomization test 
C     EXACT...... Exact significance level 
C     ISEED...... Random number generation seed 
C     KDX........ Unequal sample size variable (X) 
C     KDY........ (Same) (Y) 
C     L.......... Largest value of the test statistic 
C     NCOMB(,)... No. of combinations 
C     NX......... No. of X's 
C     NY......... No. of Y's 
C     S.......... Smallest value of the test statistic 
C     TOLER...... Tolerance on equality of sums 
C     TPOWER..... T-test power value 
C     TVAL....... T-test significance level 
C     ZL, ZS..... Quantiles of the standard normal distribution 
C                 associated with the largest and smallest 
C                 values of the test statistics. 
C 
C     *************************************************************** 
C     Program Begins Here. 


      INTEGER NCOMB(2:15,2:15),DTYPE,H,A(15),RESP
      REAL*4 DXY(30),SN(30),E(30),U(30),L
      REAL*8 Q,X,ZS,ZL,Y,P,APPROX
      LOGICAL MTC
C     Read in no. of combinations from external file:
      READ (UNIT=7,FMT='(I10)',ERR=1) ((NCOMB(I,J),I=2,15),J=2,15)
      GO TO 2
    1 PRINT *,'ERROR IN READ.'
      STOP
C
    2 PRINT *,'Enter the following parameters:'
      PRINT *,'Alpha'
      READ *, ALPHA
      PRINT *,'Distribution type:'
      PRINT *,'     1 = Normal'
      PRINT *,'     2 = Exponential'
      PRINT *,'     3 = Uniform'
      READ *, DTYPE
      PRINT *,'Xmean'
      READ *, XMEAN
      PRINT *,'Xsigma'
      READ *, XSIGMA
      PRINT *,'Ysigma'
      READ *, YSIGMA
      PRINT *,'Ymean range in the form YMIN, YMAX'
      READ *, YMIN, YMAX
      PRINT *,'No. of divisions to divide mean range (MDIV)'
      READ *, MDIV
      PRINT *,'No. of repetitions (NREPS)'
      READ *, NREPS
      PRINT *,'Range of sample sizes: KMIN,KMAX'
      READ *, KMIN, KMAX
      PRINT *,'For unequal sample sizes, enter KDX, KDY:'
      READ *, KDX,KDY
C
      ISEED=47771
      TOLER=1.0E-5
C
      YSTEP= (YMAX - YMIN)/MDIV
      IF (YMAX.EQ.YMIN) MDIV=0


C
C *** Begin main outer loop: vary mean of Y while holding X fixed.
      DO 7003 MSTEP=0,MDIV
      YMEAN = YMIN + YSTEP*MSTEP
      DELTA = YMEAN - XMEAN
C
C *** Next loop: vary sample sizes for X and Y.
      DO 7002 KSIZE = KMIN, KMAX
C
C     (Start with equal sample sizes, then vary by KDX,KDY):
      NX = KSIZE + KDX
      NY = KSIZE + KDY
      NC = NCOMB(NX,NY)
      N  = NX + NY
      K  = NX
C
C     Initialize power curve counters:
      KE = 0
      KA = 0
      KT = 0
C
C *** Start iteration loop:
      DO 7001 ITER = 1, NREPS
C     Select appropriate distribution:
      GO TO (101,102,103) DTYPE
  101 CALL NORMAL( ISEED,NX,NY,XMEAN,XSIGMA,YMEAN,YSIGMA,DXY )
C
C/// Data input facility:
C/// READ (UNIT=10,FMT='(7F5.1)') (DXY(I),I=1,12)
C
      GO TO 200
  102 CALL EXPONL( ISEED,NX,NY,XMEAN,YMEAN,DXY )
      GO TO 200
  103 CALL UNIFRM( ISEED,NX,NY,XMEAN,YMEAN,DXY )
C
C     Find value of observed test statistic TO:
C
  200 TO = 0.
      DO 210 IX = 1, NX
  210 TO = TO + DXY(IX)


C
C     The following section performs an exact randomization test of
C     significance for the null hypothesis Ho: Xmean = Ymean against
C     a two sided alternative. The sum of the X observations is used
C     as the test statistic. In addition, the largest and smallest
C     test statistic values are found for later use in the approximate
C     method.
C
C     Initialize parameters/counters:
C
      S = TO
      L = TO
      JS = 0
      JL = 0
      NLE = 0
      NGE = 0
C
C     Generate all possible combinations of the elements in DXY()
C     taken NX at a time. This algorithm is given in Nijenhuis, A. &
C     Wilf, H.S., Combinatorial Algorithms for Computers and Cal-
C     culators, 2nd Ed., Academic Press, 1978, pp. 26-33.
C
      MTC = .FALSE.
      DO 400 KC = 1, NC
      IF (MTC) GO TO 40
      M2=0
      H=K
      GO TO 50
   40 IF (M2.LT.N-H) H=0
      H=H+1
      M2=A(K+1-H)
   50 DO 51 J=1,H
   51 A(K+J-H)=M2+J
      MTC=A(1).NE.N-K+1
C
C     Find sum of the X's for this combination:
C
      T = 0.
      DO 300 IL = 1, NX
  300 T = T + DXY(A(IL))
C
C/// Test statistic output facility:
C     WRITE(10,69) T,NC
C  69 FORMAT(F10.4,2X,I6)
C
C     Find smallest & largest sums and count them:
C
      IF ( ABS(T-S) .LT. TOLER ) THEN
         JS = JS + 1
         GO TO 310
      ELSE IF ( T .LT. S ) THEN
         S = T
         JS = 1
      END IF
C
  310 IF ( ABS(L-T) .LT. TOLER ) THEN
         JL = JL + 1
         GO TO 320
      ELSE IF ( T .GT. L ) THEN
         L = T
         JL = 1
      END IF
  320 CONTINUE
C
C     Count # of observations <= and >= TO:
      IF ( T.LE.TO ) NLE = NLE + 1
      IF ( T.GE.TO ) NGE = NGE + 1
  400 CONTINUE


C
C     Compute exact significance level:
      IF ( NLE.LE.NGE ) EXACT = REAL( 2*NLE ) /REAL(NC)
      IF ( NGE.LE.NLE ) EXACT = REAL(2*NGE)/REAL(NC)
C
C     Perform approximate method:
      Q = DBLE(JS) / DBLE(NC)
      CALL INORM ( Q,X,IERR )
      IF ( IERR.EQ.1 ) THEN
         PRINT *,'Error in subroutine INORM'
         STOP
      END IF
      IF ( JS.EQ.JL ) THEN
         ZS = X
         ZL = -X
      ELSE
         ZS = X
         Q = DBLE(JL) / DBLE(NC)
         CALL INORM ( Q,X,IERR )
         IF ( IERR.EQ.1 ) THEN
            PRINT *,'Error in subroutine INORM (2)'
            STOP
         END IF
         ZL = -X
      END IF
C
      Y = ( ZL*(TO-S) + ZS*(L-TO) ) / (L-S)
      CALL MDNORD ( Y,P )
      IF ( P.LE. 0.5D0 ) THEN
         APPROX = 2.0D0 * P
      ELSE
         APPROX = 2.0D0 * ( 1.0D0 - P )
      END IF
C
C     Perform standard t-test:
      CALL TTEST ( DXY,NX,NY,TVAL )
C
C     Increment power curve counters:
      IF ( EXACT.LE.ALPHA ) KE = KE + 1
      IF ( APPROX.LE.ALPHA ) KA = KA + 1
      IF ( TVAL.LE.ALPHA ) KT = KT + 1
C
C     (The output unit number is illegible in the source listing;
C     unit 8 is assumed here.)
      WRITE (8,1000) DTYPE,XMEAN,YMEAN,DELTA,NX,NY,TVAL,EXACT,APPROX
 1000 FORMAT(I1,3(2X,F6.3),2(1X,I3),3(2X,F7.5))
C
 7001 CONTINUE


C
C     Calculate ave. power values for this sample size:
C
      REPS = REAL(NREPS)
      EPOWER = KE / REPS
      APOWER = KA / REPS
      TPOWER = KT / REPS
C
C     Write power curve values into separate file:
C     (The output unit number is illegible in the source listing;
C     unit 9 is assumed here.)
C
      WRITE (9,2000) DTYPE,NX,NY,DELTA,EPOWER,APOWER,TPOWER
 2000 FORMAT(3(I3,2X),F6.3,3(2X,F7.5))
C
 7002 CONTINUE
 7003 CONTINUE
C
      PRINT *,' Another run? 0 = no, 1 = yes'
      READ *, RESP
      IF ( RESP.EQ.1 ) GO TO 2
C
      STOP
      END


      SUBROUTINE NORMAL ( ISEED,NX,NY,XMEAN,XSIGMA,YMEAN,YSIGMA,DXY )
C     Generates normal X and Y samples.
C
      DIMENSION DXY( NX + NY ),SN(30)
      NGEN = NX + NY
      CALL SNOR ( ISEED,SN,NGEN,2,0 )
C
C     Generate X:
      DO 1 IX = 1, NX
    1 DXY(IX) = XMEAN + XSIGMA * SN(IX)
C
C     Generate Y:
      DO 2 IY = 1, NY
    2 DXY(NX+IY) = YMEAN + YSIGMA * SN(NX+IY)
C
      RETURN
      END


      SUBROUTINE EXPONL( ISEED,NX,NY,XMEAN,YMEAN,DXY )
C     Loads DXY with exponentially distributed X,Y.
C
      DIMENSION DXY( NX+NY ),E(30)
      NGEN = NX + NY
      CALL SEXPN ( ISEED,E,NGEN,2,0 )
C
C     Generate X:
      DO 1 IX = 1, NX
    1 DXY(IX) = XMEAN*E(IX)
C
C     Generate Y:
      DO 2 IY = 1, NY
    2 DXY(NX+IY) = YMEAN*E(NX+IY)
C
      RETURN
      END


      SUBROUTINE UNIFRM ( ISEED,NX,NY,XMEAN,YMEAN,DXY )
C     Loads DXY with uniformly distributed X,Y.
C
      DIMENSION DXY(NX+NY),U(30)
      NGEN = NX + NY
      CALL SRND ( ISEED,U,NGEN,2,0 )
C
C     Generate X:
      DO 1 IX = 1, NX
    1 DXY(IX) = U(IX) + XMEAN - 0.5
C
C     Generate Y:
      DO 2 IY = 1, NY
    2 DXY(NX+IY) = U(NX+IY) + YMEAN - 0.5
C
      RETURN
      END


      SUBROUTINE INORM ( Q,X,IERR )
C     This routine produces the inverse of the normal probability function
C     using a modification of the formula given in Approximations for
C     Digital Computers, C. Hastings.
C     The modification consists of the addition of a sinusoidal error
C     reduction term effective in the probability range 10**-9 < Q < .5.
      REAL*8 Q,X,N,Y,T,B
C     Is Q in the range 0 < Q <= 0.5 ?
      IF ( Q.LE. 0.0D0 .OR. Q.GT. 0.5D0 ) THEN
         IERR = 1
         GO TO 10
      END IF
      N = DSQRT ( -2.0D0 * DLOG(Q) )
      Y = 1.685085260D0 / DSQRT(N)
      T = 2.515517D0 + 8.02853D-1 * N + 1.0328D-2 * N*N
      B = 1.0D0 + 1.432788D0 * N + 1.89269D-1*N*N + 1.308D-3*N*N*N
      X = (T/B) - N + 4.434009D-4 * DSIN( 1.1493099D1*Y - 5.789591D0 )
      IERR = 0
   10 RETURN
      END


      SUBROUTINE TTEST ( DXY,NX,NY,TVAL )
C
C     This routine performs a standard 2-sample t-test for differences
C     in the means of X and Y samples.
C
      DIMENSION DXY(NX+NY)
C
C     Sum X's and X**2's:
      SUMX = 0.
      SUMX2 = 0.
      DO 1 I=1, NX
         XOB = DXY(I)
         SUMX = SUMX + XOB
    1 SUMX2 = SUMX2 + XOB*XOB
C     Same for Y's:
      SUMY = 0.
      SUMY2 = 0.
      DO 2 J=1, NY
         YOB = DXY(NX+J)
         SUMY = SUMY + YOB
    2 SUMY2 = SUMY2 + YOB*YOB
      DF = REAL( NX + NY - 2 )
C     (The next three statements are illegible in the source listing;
C     they are restored from the standard pooled-variance formulas.)
      S2P = ( SUMX2 + SUMY2 - SUMX*SUMX/NX - SUMY*SUMY/NY ) / DF
      T = ( SUMX/NX - SUMY/NY ) / SQRT( S2P*( 1.0/NX + 1.0/NY ) )
      TA = ABS(T)
C
      CALL MDTD ( TA,DF,Q,IER )
      IF ( IER.NE.0 ) PRINT *,'Error in subroutine MDTD'
      TVAL = Q
C
      RETURN
      END


LIST OF REFERENCES 

1. Fisher, R.A., The Design of Experiments, Oliver and Boyd, 1966. 

2. Box, G.E.P., Hunter, W.G., and Hunter, J.S., Statistics for Experimenters - An 
Introduction to Design, Data Analysis, and Model Building, John Wiley & Sons, 
New York, N.Y., 1978. 

3. Pitman, E.J.G., Significance tests which may be applied to samples from any 
populations. Supplement to the Journal of the Royal Statistical Society, 4, (1937), 
119-130. 

4. Conover, W.J., Practical Nonparametric Statistics, 2nd ed., John Wiley & Sons, New 
York, N.Y., 1980. 

5. Oden, A. and Wedel, H., Arguments for Fisher's Permutation Test. The Annals 
of Statistics, (1975), Vol. 3, No. 2, 518-520. 

6. Lehmann, E.L., Testing Statistical Hypotheses, John Wiley & Sons, New York, 
N.Y., 1959. 

7. Richards, F.R., OA 3103 Class Notes, Naval Postgraduate School, Monterey, 
CA., 1986. 

8. Soms, Andrew P., An Algorithm for the Discrete Fisher's Permutation Test. 
Journal of the American Statistical Association, (1977), Vol. 72, No. 359, 662-664. 

9. Garey, M.R. and Johnson, D.S., Computers and Intractability - A Guide to the 
Theory of NP-Completeness, W.H. Freeman and Company, New York, N.Y., 
1979. 

10. Lewis, H.R. and Papadimitriou, C.H., The Efficiency of Algorithms. Scientific 
American, Jan. 1978, 96-109. 

11. Edgington, Eugene S., Randomization Tests (Second Edition), Marcel Dekker, 
Inc., New York and Basel, 1987. 

12. Boyett, James M. and Shuster, J.J., Nonparametric One-Sided Tests in 
Multivariate Analysis with Medical Applications. Journal of the American 
Statistical Association, September 1977, Vol. 72, No. 359, Theory and Methods 
Section. 


INITIAL DISTRIBUTION LIST 

                                                        No. Copies 

1. Defense Technical Information Center                      2 
   Cameron Station 
   Alexandria, VA 22304-6145 

2. Library, Code 0142 
   Naval Postgraduate School 
   Monterey, CA 93943-5002 

3. Chief of Naval Operations 
   (OR 09s) 
   Navy Department 
   Washington, D.C. 20350 

4. Professor F. Russell Richards 
   Evaluation Technology Incorporated 
   2150 Garden Rd., Suite B3 
   Monterey, CA 93940-5327 

5. Professor Harold M. Fredricksen 
   Chairman, Dept. of Mathematics, Code 53 
   Naval Postgraduate School 
   Monterey, CA 93943 

6. LT Derek H. Hesse 
   37 Gorton St. 
   New London, CT 06320 











Piri i 


RY 
ia et bi SCHOOL 
haO.s ig pp 


f, VALIVORNIA 95943-5002 















ete nF senha o eet A ere 4 Ate Sig ae be os me Ma Ven —— = eo. . 1 Ole ' er 
aT sH52385 ' ‘ va : c : ' thy 1 te ; Bebe ie 
D yl i ae ' i . ; a . “4 . ' 
Practical applicabilit = 
O 3 ' ' - 7; -03 ‘s , , 
] | } min 1) cu exact and app . F ' P r aa ; . 1 : 7 - : ne Si 1 . 
NA ETAT tl Cre ace re ae 
NAT | H) ii WH il Hl Ns ae Beta 
METTLE VMG HAT THEE VUE TATE UO He A HW ANT | Bas 38, a SO Dal Seer a2 
ULL | WG ieee re, Pie Je ae 


ect , 
on nr eaten a Ce ities Site gwen 
Are eau nctenapdte cece seiner 3 2768 000 75273 7 aa ieee ye 3 













































































































DUDLEY KNOX LIBRARY | : 3 
' aon 4 ‘ s 
< fay c ' 4 ' 
ho’ @) RAD ISRE IRE NV GeAS AT 4 = ak : eee rt ks t 
ap Bde OR Etett Oe Ty. or) ‘Samnaadl + r < ; 
2 ha Be Mt ake tpainl tad Soh tot ted betee 1h Tearamte: Yoke t tat ee ae 1 ‘ + nse a a fot . 4 a : é ; 1 P ‘ , 
4f > otaeMMd fn fa® etrtiet Atte, @ * Dham tp hs Met S$ atod.% s 4 Le, ert “ tps . + ¢ 24 tee! 1 
4 tv veh? pareyane Ponicy « saath wynet cht? Scho tote Ge O% Who Bete & Vyteke teste! at df on) 7 te re aie - F 
. 4 oll tee Re KAKA Sohn do the bebe 0' 2 Pe "be be ee h-S cate? 80 eR 6 ae) awe ¢ a 23 2 . 
Sas bp Oris 1A? het)" Bn d1 ee tobi Sp 8 Qrtalt r) 44 al yi s ful ‘ . . 814 ‘ te 4 ar Ey * 
i onmanta tee FARIS 6 109% POE LTE eee aks ens how Sat ‘ re e ® oe is f : i 
eh eC itae PIRES e918 oF hy ©. Ma Mh Soh. Se PTC ese 9.502 “ray : , ! ain e ¢ 
vy Hie Se 4h bal oes >t Or Orh gt Our pte hm Ct , he Mta® inf Ants A kel Ci awe 
AB *yqelht bh CAMP 5 O43 8 hk Sole BSG Ml wit edvae tts ¢ (Ls mia o' 1A "a. 4 > 4 4 - 4 i . 5 nee ‘ ee 
dg Bale tod > Pent al etn ™ ts Bo e* @ vet focb,0 £9 4 Sah as.chge Sot Rat > « 3: : me ; i 
Meh: Oe yee rere tires 2, Ab Hh tah Roy ae? br a ore i ¢ r Peel F . : é ’ ' 
Ah SORT OEE MIs Molt DS in NOME’ S fang 0 Went 0,8 tet oF “phe! Foes phe 8 bof sot et ' . aka 7 
aoa ete cn te hadae 9 Qeeted te Baht 1 y J ¢af A 4 Fitet fueeere 2 ** } © apa 4 i i a 
a Sahev.eckar @ ee wa 5 by 4 aig Oe . og hy #4 9. wt is 4of feb s wt 1K & fg ‘ ' . Pan s + 4 5 a 
, dh, a, 20 NOS heel ARs Ow” be Pat cause oh bs 04et AE apt hh Favth bt sGot ; TRAV t bX FO — 8 Ss gay +a ee ; pee 
2 } mad CARLA By tealtinh nit shhe eS rate tata oe barmeTG Ss Nabe Melt bb ry fo pet 9 LntetisMre tes SVG ® fomho” 8 (= Te) ee te . ‘ ‘ 
um hub es Pa er np 90° cheestn bP suigterm se © Sof mMat ahs ott etait 1 04 db e6 seh edb sey 2-60 Caigieht Sheth! Ch! Binks , . ji a 3 ete oa . jee 
La nhc tabs has nnarad afm pers Job tic ttaty.Ratates &fambglae hy tm ‘a siya nt B18 9 8 ' wiaet = . F a ate 
4 perp ae ee ty 7 Re tixtenatobeds abe Oe ee as bee Phas a® bo Do Gd ohn Jeata’y sat 3 - 4 7 ¢ pei! ene : 
= ier oe oh ets arnt pie 4 ee OA PT me ety hy oteP gt © 6, 1lad cae) 0 amet aR Ri bats P ° ‘ ° 
a bet Ace MAM: Bem Resto pee tad 2a: Sep ri mate wes rly or ey Perey PY 2 cae ats ee fae j ‘6 ‘ +6 : aes 
ay : wer meray Pl APT ed se! a ve nA f 1 ge te pe é » r ge i ‘ 1. ,ee é as 
od MME haamn he 4g eh CA 2 ofhba ts Lehyp& Se tate hgtil » 1a? Beh ttt D ' * Rey Fae pe teeger « F > rs a 
§ obnnsig iy aah gan Deh Sa thee wrtehllae Wendt BAS ® orhrat he @abet tg" . “ . . ' ~ F ’ ry 2 
on ta AA Sek ns POLE erm h Sptete! tatu nara’ eh Seta itm A %48 ies Bb os! a! & ‘ qs aad ; ‘ P 1 ' $ ' 
Per et pe Se Te Liked tor er 4a BS OME uw ® Sortiehie Ary bath # hat : : ar ‘ 
pgm a On cos ober Sek CL pele toe Hebets Tack 1R IO G45 Dobcity dt?! te % B Paty S+0 p09 me 4 Oe : LP nee Sp a ce NA ' 

' ate ? Dg As nhs Oat Ae IE gle nbcmiei Bee reke ethan! “regent nati aeg si ot ' ros bie "! ¢ ue oe ial ot eas 
PGE A (mot oS oD BA OMR ~ = coektt ib Ngreeteer ta ert He Peed hede to ethgan! Bolt ESE A 7h te 6f (te t Tua he ie oe y ae eo Rea ; 
peer Pep is “ye oye y yt ai het ert Hap ares Sa ng Fangs 08 4 Saf BCT Pec bie? oft Wate Ue ob fre Ey! OM aiease Pe 8 Rr OM 5 1 ree ve guage. Os 
. ant Oo & Soe Oe Pen abe re A hackevtemmben pie FA bony gt 48 Pita sate @,0 8401 x Mia? ett we tg Beh Boh ach net gt de re a Pane ; ree 

bes ot o>, oad eR rhe 9 Me thse ™- Aggcegtengeas’ o met gee Watnk SIGE AAAS Y + ° 6 iz a ies et he a on Th) ' i r - 
J = Pepe ey eat Ts Gian daa etetees, Var Pa Ret® manrectogh-y2 ohag rote -@8 Sofia med phg@ 10 pit of Se be “8” tos a ‘ gis on wre 7 5 A , ae 

re ree bat peace. 1 poperats Sate Le tgre €; ab Pai mbes + het > ak? ph See edabolnt sta gh aa * euch § be) © es 1 1 1 A 5 nk 7 Pie tet se 
dae tse it rset OY 94 f Node Ore g: 2 2 pemlagl Meat sts need i> ve 9 nee” Of yt Sctahaio She! 8A F oan D mt Oni 2 ee SO SC OF Ar ee eS 5 A a Fl 
eongenturats - Fate ecasn of Cea! <a je ear w'gha tow Ra% tevpy"s fre 4. P28 e Aig 2? bop reg & ' ' . e as 13 ' i D P 5 5 = F - Hs 
Sasekdl yf ; Ast mebs hates bo bait 1M! Mbt 0 Beate Fos By wA0 8 Metin As hKRnen Fane Cr rr i Oe ' 5 ' soe 
‘ nt PRET Or te QO Gop s a S38 afer h Jebermmada ted © Mal Ym GRSucl ty 18 Sth Roms atigt ‘ *. 8 ag Sls 8 hae . F 7 ‘ Fi 

iMgrd ty % brea? .OAntb ont Hereiied are rrs Cetera UL kL ee tee tohet: $ 1,3 3 weno 6 rt ee Na oe zs ; ; J . a : f) 

3 Fab ye i ae eo 5 et 4 ' ‘ tek Sites . : oe ’ 
' ‘ » * ‘ : ‘ 


2 ei eg DSO 94, oe” * MN aL ns Qh eerste ber ti» beth 













































